Abstract

In a missing-data setting, we have a sample in which a vector of explanatory variables $x_i$ is observed
for every subject $i$, while scalar outcomes $y_i$ are missing by happenstance on some individuals. In
this work we propose robust estimates of the distribution of the responses assuming missing at 
random (MAR) data, under a semiparametric regression model. Our approach allows the consistent 
estimation of any weakly continuous functional of the response's distribution. In particular, strongly 
consistent estimates of any continuous location functional, such as the median or MM functionals, are 
proposed. A robust fit for the regression model combined with the robust properties of the location 
functional gives rise to a robust recipe for estimating the location parameter. Robustness is quantified 
through the breakdown point of the proposed procedure. The asymptotic distribution of the location 
estimates is also derived. 



1 Introduction 

Suppose we have a sample from a population such that, for every subject $i$ in the sample, we observe a vector
of explanatory variables $x_i$, while a scalar response $y_i$ is missing by happenstance on some individuals.
A classical problem is to construct consistent estimators for the mean value of the response based on 
the observed data. In order to identify the parameter of interest in terms of the distribution of observed 
data, missing at random (MAR) is assumed. 

This hypothesis establishes that the value of the response does not provide additional information,
on top of that given by the explanatory variables, to predict whether an individual will present a missing
response (see Rubin [15]). To be more rigorous, let us introduce a binary variable $a_i$ such that $a_i = 1$
whenever the response is observed for subject $i$. In this way, MAR states that

$$P(a_i = 1 \mid x_i, y_i) = P(a_i = 1 \mid x_i). \tag{1}$$

Under this condition, if $P(a_i = 1 \mid x_i) > 0$, we have that

$$E[y_i] = E\left[\frac{a_i y_i}{\pi(x_i)}\right], \tag{2}$$



where $\pi(x_i) = P(a_i = 1 \mid x_i)$, and identifiability of $E[y_i]$ holds. One approach to estimating $E[y_i]$ consistently,
called inverse probability weighting (IPW), is based on (2) and requires estimating the propensity score
function $\pi(x)$. The estimate of $E[y_i]$ is then obtained by replacing $\pi(x_i)$ in (2) by its estimate and the
expectation by its empirical version. MAR also implies that the conditional distribution of the response
given the vector of explanatory variables remains the same regardless of whether the response is
observed: $y_i \mid x_i \sim y_i \mid x_i, a_i = 1$. Then $E[y_i \mid x_i] = E[y_i \mid x_i, a_i = 1]$. Since $E[y_i] = E[E[y_i \mid x_i]]$, a
second approach to estimating $E[y_i]$ is based on a regression model (parametric or nonparametric) for
$E[y_i \mid x_i] = g(x_i)$, which is fitted using only the individuals for whom the response is observed. Then a
second estimate of $E[y_i]$ is obtained by averaging $\hat g(x_i)$ over the whole sample, where $\hat g$ is an estimate of
$g$. There is a third approach (doubly protected) that postulates models for $\pi(x)$ and $g(x)$ and obtains a
consistent estimate of $E[y_i]$ if at least one of the two models is correct. A recent survey and discussion of
these three approaches can be found in Kang and Schafer [12] and Robins, Sued, Lei-Gomez and Rotnitzky.
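
The first two approaches can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the simulated data, the logistic propensity model, the linear regression model, and the function names `fit_logistic`, `ipw_mean` and `regression_mean` are hypothetical and are not taken from the references above.

```python
import numpy as np

def _design(X):
    """Prepend an intercept column to the covariate matrix."""
    return np.column_stack([np.ones(len(X)), X])

def fit_logistic(X, a, n_iter=25):
    """Newton-Raphson fit of an (assumed) logistic propensity model P(a=1|x)."""
    Z = _design(X)
    coef = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Z @ coef))
        grad = Z.T @ (a - p)
        hess = Z.T @ (Z * (p * (1.0 - p))[:, None])
        coef = coef + np.linalg.solve(hess, grad)
    return coef

def ipw_mean(X, y, a, coef):
    """Empirical version of (2), with pi replaced by its estimate."""
    pi_hat = 1.0 / (1.0 + np.exp(-_design(X) @ coef))
    y_filled = np.where(a == 1, y, 0.0)   # missing responses get weight zero
    return np.mean(y_filled / pi_hat)

def regression_mean(X, y, a):
    """Fit g on the observed cases only, then average g_hat over everyone."""
    Z = _design(X)
    obs = a == 1
    beta, *_ = np.linalg.lstsq(Z[obs], y[obs], rcond=None)
    return np.mean(Z @ beta)

# Hypothetical simulated data with E[y] = 1 + 2*0.5 - 0.5 = 1.5
rng = np.random.default_rng(0)
n = 20000
X = rng.uniform(size=(n, 2))
y_full = 1.0 + 2.0 * X[:, 0] - X[:, 1] + rng.normal(size=n)
pi_true = 1.0 / (1.0 + np.exp(-(0.5 + X[:, 0])))   # MAR: depends on x only
a = (rng.uniform(size=n) < pi_true).astype(int)
y = np.where(a == 1, y_full, np.nan)               # y is missing when a = 0

est_ipw = ipw_mean(X, y, a, fit_logistic(X, a))
est_reg = regression_mean(X, y, a)
est_naive = np.nanmean(y)   # complete-case mean, not consistent in general under MAR
print(est_ipw, est_reg, est_naive)
```

Both estimates rely on least squares or on unbounded weights, so neither is robust; the sketch is only meant to fix ideas about the two classical constructions.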

As is well known, the mean is not a robust location parameter, i.e., a small change in the population
distribution may have a large effect on this parameter. As a consequence, the mean does not
admit consistent nonparametric robust estimates, except when strong properties of the distribution
are assumed, for example symmetry. For this reason, to introduce robustness in the present setting,
we start by reformulating the statistical object of interest: instead of estimating the mean value of the
response, we look for consistent estimates of $T_L(F_0)$, where $T_L$ is a robust location functional and $F_0$
is the distribution of $y_i$. Bianco, Boente, Gonzalez-Manteiga and Perez-Gonzalez [1] used this approach
to obtain robust and consistent estimates of an M location parameter of the distribution of $y_i$. In their
treatment they assumed a partially linear model to describe the relationship between $y_i$ and $x_i$, and
also that the distributions of the response $y_i$ and of the regression error under the true model are both
symmetric.

In this paper we introduce a new estimate of any continuous location functional assuming that the
relation between $y_i$ and $x_i$ is given by a semiparametric regression model. We show that once the
regression model is fitted using robust estimates, we can define a consistent estimate of the distribution
function of the response. Then any parameter of the response distribution defined through a weakly
continuous functional may also be consistently estimated by evaluating the functional at the estimated
distribution function. The consistency of this procedure does not require the symmetry assumptions used
by Bianco et al. [1].

A robust fit for the regression model, combined with the robust properties of the location functional
under consideration, gives rise to a robust recipe for estimating the location parameter. Robustness is quantified
through the breakdown point of the proposed procedure. In particular, our results can be applied when the
location functional is the median or an MM location functional.

The proposed procedure may be considered as a robust extension of the second approach described
above for estimating $E[y_i]$. We have not found a way to robustify the approaches that use the propensity
score $\pi(x)$. The main difficulty in such cases is to obtain a consistent procedure avoiding the assignment
of very large weights to those observations with very small $\pi(x_i)$.

This work is organized as follows. In Section 2 we formalize the problem of the robust estimation 
of a location parameter with missing data. We propose a family of procedures which depend on the 
location functional to be estimated and also on the robust regression estimate for the parameter of the 
regression model postulated to describe the relationship between $x_i$ and $y_i$. In Section 3 we show that,
under some assumptions on the location functional and the regression estimate, the proposed estimates 
are strongly consistent and asymptotically normal. In Section 4 we study the breakdown point of the 
proposed estimates. In Section 5 we show that when the location and regression estimates are of MM
type, the assumptions that guarantee consistency and asymptotic normality of the proposed estimates 
are satisfied. In Section 6 we present the results of a Monte Carlo study which shows that the proposed 
estimates are highly efficient under Gaussian errors and highly robust under outlier contamination. Proofs 
are presented in the Appendix. 
2 Notation and Preliminaries



We first introduce some notation. Henceforth $E_G[h(z)]$ and $P_G(A)$ will respectively denote the expectation
of $h(z)$ and the probability that $z \in A$, when $z$ is distributed according to $G$. If $z$ has distribution
$G$ we write $z \sim G$ or $\mathcal{D}(z) = G$. Weak convergence of distributions, convergence in probability and
convergence in distribution of random variables or vectors are denoted by $G_n \to_w G$, $z_n \to_p z$ and $z_n \to_d z$,
respectively. By an abuse of notation, we will write $z_n \to_d G$ to denote $\mathcal{D}(z_n) \to_w G$. We use $o_P(1)$ to
denote any sequence that converges to zero in probability. The complement and the indicator of the set
$A$ are denoted by $A^c$ and $1_A$, respectively. The scalar product of vectors $a, b \in \mathbb{R}^s$ is denoted by $a'b$.
$\mathbb{R}_+$ denotes the set of positive real numbers.

Throughout this paper we use the expression empirical distribution of a sequence of $n$ points $z_1, z_2, \dots, z_n$
in $\mathbb{R}^k$ to denote the function $F_n : \mathbb{R}^k \to [0, 1]$ such that, given $z \in \mathbb{R}^k$, $F_n(z) = m/n$, where $m$ is the
number of points $z_i$ all of whose coordinates are smaller than or equal to the corresponding ones of
$z$.
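
As a small illustration of this definition, the following sketch evaluates the empirical distribution of $n$ points in $\mathbb{R}^k$ at a given $z$. The function name `empirical_cdf` and the toy points are hypothetical choices made for this example.

```python
import numpy as np

def empirical_cdf(points, z):
    """F_n(z) = m / n, where m counts the points whose coordinates are all <= z.

    `points` is expected to have shape (n, k); `z` has length k.
    """
    points = np.asarray(points, dtype=float)
    z = np.asarray(z, dtype=float)
    m = np.sum(np.all(points <= z, axis=1))
    return m / points.shape[0]

pts = [[0.0, 0.0], [1.0, 2.0], [2.0, 1.0], [3.0, 3.0]]
print(empirical_cdf(pts, [2.0, 2.0]))   # three of the four points are coordinatewise <= (2, 2)
```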

2.1 Describing our setting: the data, the problem and the model 

Throughout this work, we have a random sample of $n$ subjects and for each subject $i$ in the sample,
$1 \le i \le n$, a vector of explanatory variables $x_i$ is always observed, while the response $y_i$ is missing on
some subjects. Let $a_i$ be the indicator of whether $y_i$ is observed for subject $i$: $a_i = 1$ if $y_i$ is observed and
$a_i = 0$ if it is not.

We will be concerned with the estimation of a location functional at the distribution of the response.
A location functional $T_L$, defined on a class of univariate distribution functions $\mathcal{G}$, assigns to each $F \in
\mathcal{G}$ a real number $T_L(F)$ satisfying $T_L(F_{ay+b}) = a T_L(F_y) + b$, where $F_y$ denotes the distribution of the
random variable $y$.

Examples of location functionals are the mean and the median. Another important class of location
functionals, which includes the mean, the median and other robust estimates, is the class of M location
functionals. This class also includes S and MM estimators, which will be described in Section 5. We should
also mention the class of L location functionals (see, e.g., Chapter 2 of Maronna, Martin and Yohai [13]),
but we do not study this class in this work.

A functional $T$ is said to be weakly continuous at $F$ if, given a sequence $\{F_n\}$ of distribution functions
that converges weakly to $F$ ($F_n \to_w F$), we have $T(F_n) \to T(F)$. In order to obtain a consistent estimate
of a location parameter defined by means of a weakly continuous functional, it is sufficient to have a
sequence of estimates $\hat F_n$ that converges weakly to the distribution of the $y_i$'s.

To be more precise, denote by $F_0$ the distribution of the outcomes $y_i$. Let $T_L$ be a weakly continuous
location functional at $F_0$. We are interested in estimating

$$\mu_0 = T_L(F_0).$$

We assume a semiparametric regression model

$$y_i = g(x_i, \beta_0) + u_i, \quad 1 \le i \le n, \tag{3}$$

with $y_i, u_i \in \mathbb{R}$, $x_i \in \mathbb{R}^p$, $u_i$ independent of $x_i$, $\beta_0 \in B \subset \mathbb{R}^q$, $g : \mathbb{R}^p \times B \to \mathbb{R}$. Furthermore, in order to
guarantee the MAR condition, we assume that $u_i$ is independent of $(x_i, a_i)$. We denote by $Q_0$ and $K_0$
the distributions of $x_i$ and $u_i$, respectively.

To identify $\beta_0$ without assuming that either (i) $K_0$ is symmetric around 0 or (ii) $K_0$ satisfies a
centering condition (e.g., $E_{K_0} u = 0$), we assume that

$$P_{Q_0}\big(g(x, \beta_0) = g(x, \beta) + \alpha\big) < 1 \tag{4}$$

for all $\beta \neq \beta_0$ and all $\alpha$. This condition requires that, if there is an intercept, it be included in the
error term instead of as a parameter of the regression function $g(x, \beta)$. For linear regression we have
$g(x, \beta) = \beta' x$, and then this condition means that the vector $x_i$ is not concentrated on any hyperplane.



2.2 The proposal 

Recall that $K_0$ denotes the distribution of $u_i$ and let $R_0$ denote the distribution of $g(x_i, \beta_0)$. Independence
between $x_i$ and $u_i$ guarantees that $F_0$ is the convolution of $R_0$ and $K_0$. Then, by convolving
consistent estimators $\hat R_n$ and $\hat K_n$ of each of these distributions, we get a consistent estimator of $F_0$.

In order to estimate $R_0$ and $K_0$ we need a robust and strongly consistent estimator $\hat\beta_n$ of $\beta_0$.
This estimator may be, for example, an S estimate (see Rousseeuw and Yohai [17]) or an MM estimate
(see Yohai [20]). Since $u_i$ is independent of $a_i$, $\hat\beta_n$ may be obtained by a robust fit of the model using the
data for which $y_i$ is observed, i.e., using the observations $(x_i, y_i)$ with $a_i = 1$. Let $\hat R_n$ be the empirical
distribution of $g(x_j, \hat\beta_n)$, $1 \le j \le n$, defined by

$$\hat R_n = \frac{1}{n} \sum_{j=1}^{n} \delta_{g(x_j, \hat\beta_n)}, \tag{5}$$

where $\delta_s$ denotes the point mass distribution at $s$.

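Let $A = \{i : a_i = 1\}$ and $m = \#A$. For $i \in A$ consider

$$\hat u_i = y_i - g(x_i, \hat\beta_n).$$

The estimator $\hat K_n$ of $K_0$ is defined as the empirical distribution of $\{\hat u_i : i \in A\}$:

$$\hat K_n = \frac{1}{m} \sum_{i \in A} \delta_{\hat u_i} = \frac{1}{\sum_{i=1}^{n} a_i} \sum_{i=1}^{n} a_i \delta_{\hat u_i}. \tag{6}$$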

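Then we estimate $F_0$ by $\hat F_n = \hat R_n * \hat K_n$, where $*$ denotes convolution. Note that $\hat R_n * \hat K_n$ is the
empirical distribution of the $nm$ points

$$\hat y_{ij} = g(x_j, \hat\beta_n) + \hat u_i, \quad 1 \le j \le n, \ i \in A,$$

and therefore we can also express $\hat F_n$ as

$$\hat F_n = \frac{1}{nm} \sum_{i \in A} \sum_{j=1}^{n} \delta_{\hat y_{ij}} = \frac{1}{n \sum_{i=1}^{n} a_i} \sum_{i=1}^{n} \sum_{j=1}^{n} a_i \delta_{\hat y_{ij}}. \tag{7}$$

Finally, we estimate $\mu_0$ by

$$\hat\mu_n = T_L(\hat F_n). \tag{8}$$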
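
The construction above can be sketched in a few lines for the linear case $g(x, \beta) = \beta' x$ with the median as location functional. This is only an illustration under stated assumptions: the regression fit below is a Huber-type M-estimate computed by iteratively reweighted least squares, used as a simple stand-in for the S or MM fit that the text calls for (it does not have a high breakdown point against leverage outliers); the function names, the simulated data and the missingness mechanism are hypothetical.

```python
import numpy as np

def huber_slopes(X, y, c=1.345, n_iter=50):
    """Stand-in robust fit (Huber IRLS with MAD scale); returns the slopes only."""
    Z = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    for _ in range(n_iter):
        r = y - Z @ coef
        scale = 1.4826 * np.median(np.abs(r - np.median(r))) + 1e-12
        w = np.minimum(1.0, c * scale / np.maximum(np.abs(r), 1e-12))
        sw = np.sqrt(w)
        coef, *_ = np.linalg.lstsq(Z * sw[:, None], y * sw, rcond=None)
    return coef[1:]          # the intercept is left in the error term

def convolution_location(X, y, a, location=np.median):
    """Evaluate a location functional at the convolution estimate of F_0."""
    obs = a == 1
    beta = huber_slopes(X[obs], y[obs])
    fitted = X @ beta                     # g(x_j, beta_n) for all j = 1..n
    resid = y[obs] - X[obs] @ beta        # residuals u_i for i in A
    y_ij = (fitted[:, None] + resid[None, :]).ravel()   # the n*m support points
    return location(y_ij)

# Hypothetical data: y = 3*(x1+...+x5) + u, so the center of F_0 is 7.5
rng = np.random.default_rng(1)
n = 500
X = rng.uniform(size=(n, 5))
y_full = 3.0 * X.sum(axis=1) + rng.normal(size=n)
p_obs = 1.0 / (1.0 + np.exp(-(1.0 + X[:, 0])))   # assumed MAR mechanism
a = (rng.uniform(size=n) < p_obs).astype(int)
y = np.where(a == 1, y_full, np.nan)
mu_clean = convolution_location(X, y, a)

# Replace 10% of the observed responses by a gross outlier
y_bad = y.copy()
obs_idx = np.flatnonzero(a == 1)
bad = rng.choice(obs_idx, size=len(obs_idx) // 10, replace=False)
y_bad[bad] = 50.0
mu_median = convolution_location(X, y_bad, a)
mu_mean = convolution_location(X, y_bad, a, location=np.mean)
print(mu_clean, mu_median, mu_mean)
```

Materializing all $nm$ points is affordable for moderate sample sizes; for larger samples one would evaluate the functional without forming the full array.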

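Since we have assumed weak continuity of $T_L$ at $F_0$, in order to prove that $\hat\mu_n$ is a strongly consistent
estimate of $\mu_0$ we only need to prove that $\hat F_n \to_w F_0$ a.s. Observe that, for any function $h$,

$$E_{\hat F_n} h(y) = \frac{1}{nm} \sum_{i \in A} \sum_{j=1}^{n} h\big(g(x_j, \hat\beta_n) + \hat u_i\big).$$

The right-hand side of this equation was proposed by Müller P3] to estimate $E_{F_0} h(y)$.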
3 Consistency and asymptotic distribution



Let $(x_i, y_i)$ and $u_i$ satisfy model (3), with $u_i$ independent of $(x_i, a_i)$. Denote by $G_0$, $Q_0$ and $K_0$ the
distributions of $(x_i, y_i)$, $x_i$ and $u_i$, respectively, and denote by $G_0^*$ and $Q_0^*$ the distributions of $(x_i, y_i)$ and
$x_i$ conditioned on $a_i = 1$, respectively.

The MAR condition implies that under $G_0^*$ model (3) is still satisfied with $x_i$ and $u_i$ independent,
$x_i$ with distribution $Q_0^*$ and $u_i$ with distribution $K_0$. We also assume that the regression function $g$
satisfies the following assumption:

A0 $g(x, \beta)$ is twice continuously differentiable with respect to $\beta$ and there exists $\delta > 0$ such that

$$E_{Q_0} \sup_{\|\beta - \beta_0\| \le \delta} \|\dot g(x_i, \beta)\| < \infty \quad \text{and} \quad E_{Q_0} \sup_{\|\beta - \beta_0\| \le \delta} \|\ddot g(x_i, \beta)\| < \infty, \tag{9}$$

where $\dot g(x, \beta)$ and $\ddot g(x, \beta)$ denote the vector of first derivatives and the matrix of second derivatives
of $g$ with respect to $\beta$, respectively.

In order to prove the consistency and the asymptotic normality of $\hat\mu_n$, the following assumptions on
$\hat\beta_n$ and $T_L$ are required.

A1 $\{\hat\beta_n\}$ is strongly consistent for $\beta_0$.

A2 The regression estimate $\hat\beta_n$ satisfies

$$\sqrt{n}\,(\hat\beta_n - \beta_0) = \frac{1}{n^{1/2}} \sum_{i=1}^{n} a_i I_R(x_i, y_i) + o_P(1), \tag{10}$$

for some function $I_R(x, y)$ with $E[a_i I_R(x_i, y_i)] = 0$ and finite second moments.
A3 $T_L$ is weakly continuous at $F_0$.

A4 The following expansion holds:

$$\sqrt{n}\,\big(T_L(\hat F_n) - T_L(F_0)\big) = \sqrt{n}\, E_{\hat F_n} I_L(y) + o_P(1), \tag{11}$$

for some differentiable function $I_L(y)$ with $E_{F_0} I_L(y) = 0$, $E_{F_0} I_L^2(y) < \infty$ and $I_L'$ bounded.

It can be shown that when expansion (11) holds, $I_L$ is given by the influence function (as defined
by Hampel (1974)) of $T_L$ at $F_0$. When $\hat\beta_n$ is obtained using a regression functional, a similar statement
holds.

The following Theorem shows the consistency of $\hat\mu_n = T_L(\hat F_n)$.

Theorem 1 Let $\hat F_n$ be defined as in (7) and assume that A1 holds. Then (a) $\{\hat F_n\}$ converges weakly to
$F_0$ a.s., i.e.,

$$P(\hat F_n \to_w F_0) = 1.$$

(b) Assume also that A3 holds; then $\hat\mu_n = T_L(\hat F_n)$ converges a.s. to $\mu_0 = T_L(F_0)$.

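In order to find the asymptotic distribution of $\hat\mu_n$, consider

$$\eta = E a_1, \qquad c = E\big[a_1 I_L'\big(y_1 - g(x_1, \beta_0) + g(x_2, \beta_0)\big)\{\dot g(x_2, \beta_0) - \dot g(x_1, \beta_0)\}\big],$$

$$e(x_i, u_i, a_i) = E\big[a_i I_{T_L, F_0}(u_i + g(x_j, \beta_0)) \mid u_i, a_i\big] = a_i E\big[I_{T_L, F_0}(u_i + g(x_j, \beta_0)) \mid u_i, a_i\big],$$

$$f(x_j) = E\big[a_i I_{T_L, F_0}(u_i + g(x_j, \beta_0)) \mid x_j\big],$$

$$\tau^2 = \frac{1}{\eta^2} E\big\{e(x_1, u_1, a_1) + f(x_1) + a_1 c' I_R(x_1, y_1)\big\}^2.$$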
Then the following Theorem gives the asymptotic normality of the estimate $\hat\mu_n$ defined in (8).

Theorem 2 Assume A0-A4. Then

$$n^{1/2}(\hat\mu_n - \mu_0) \to_d N(0, \tau^2). \tag{12}$$



3.1 The median as location parameter 

The median is one of the most popular robust location parameters. However, since this functional does
not satisfy A4, we cannot prove its asymptotic normality using Theorem 2. In this section, we will prove
consistency and derive the asymptotic distribution of the median of $\hat F_n$, defined in (7), assuming that A0 holds
and that $\{\hat\beta_n\}$ satisfies A1 and A2.

The functional $T_{med}$ is defined by

$$T_{med}(F) = \arg\min_{\mu} E_F |y - \mu|. \tag{13}$$

When more than one value attains the minimum, the functional is defined by choosing any
of them. We have the following result, whose proof needs an extra argument to compensate for the absence
of differentiability of $I_{T_{med}, F_0}(y)$.



Theorem 3 Assume that $\mu_0 = T_{med}(F_0)$ is well defined and let $\hat\mu_n = T_{med}(\hat F_n)$. Suppose that $F_0$ is
continuous and strictly increasing at $\mu_0$. Then, (a) under A1 we have $\hat\mu_n \to \mu_0$ a.s.
(b) Assume A0-A2. Assume also that $F_0$ and $K_0$ have continuous and bounded densities $f_0$ and $k_0$,
respectively, and that $f_0(\mu_0) > 0$. Then

$$n^{1/2}(\hat\mu_n - \mu_0) \to_d N(0, \tau^2), \tag{14}$$

where $\tau^2$ is as in Theorem 2 with $c$ replaced by

$$\frac{1}{f_0(\mu_0)} E\big[a_1 k_0\big(-g(x_2, \beta_0) + \mu_0\big)\{\dot g(x_2, \beta_0) - \dot g(x_1, \beta_0)\}\big] \tag{15}$$

and $I_{T_L, F_0}(y)$ replaced by

$$I_{T_{med}, F_0}(y) = \frac{\operatorname{sign}(y - \mu_0)}{2 f_0(\mu_0)}.$$



4 Breakdown point 

Consider first a dataset of $n$ complete observations $Z = \{z_1, \dots, z_n\}$, where $z_i \in \mathbb{R}^j$, and let $\hat\theta_n(Z)$ be an
estimate of a parameter $\theta \in \mathbb{R}^k$ defined on all possible datasets. Donoho and Huber [3] define the finite
sample breakdown point (FSBP) of $\hat\theta_n$ at $Z$ by

$$\varepsilon^*(\hat\theta_n, Z) = \min\left\{\frac{s}{n} : \sup_{Z^* \in \mathcal{Z}_s} \|\hat\theta_n(Z^*)\| = \infty\right\},$$

where

$$\mathcal{Z}_s = \left\{Z^* = \{z_1^*, \dots, z_n^*\} : \sum_{i=1}^{n} I\{z_i^* \neq z_i\} \le s\right\}.$$

Then $\varepsilon^*$ is the minimum fraction of outliers that is required to take the estimate beyond any bound.
Now we extend the notion of FSBP to the present setting, where there are missing data, as follows.

Let

$$W = \{(x_1, y_1, a_1), \dots, (x_n, y_n, a_n)\} \tag{16}$$

be the set of all observations and missingness indicators, and let $A = \{i : 1 \le i \le n, a_i = 1\}$,
$m = \#A$. Denote by $\mathcal{W}_{t,s}$ the set of all samples obtained from $W$ by replacing at most $t$ points by
outliers, with at most $s$ of these replacements corresponding to the non-missing observations. Then $W^* =
\{(x_1^*, y_1^*, a_1), \dots, (x_n^*, y_n^*, a_n)\}$ belongs to $\mathcal{W}_{t,s}$ if

$$\sum_{i \in A} I\{(x_i^*, y_i^*) \neq (x_i, y_i)\} + \sum_{i \in A^c} I\{x_i^* \neq x_i\} \le t$$

and

$$\sum_{i \in A} I\{(x_i^*, y_i^*) \neq (x_i, y_i)\} \le s.$$

Given an estimate $\hat\mu_n$ of $\mu_0$, we define

$$M_{t,s} = \sup_{W^* \in \mathcal{W}_{t,s}} |\hat\mu_n(W^*)|$$

and

$$\kappa(t, s) = \max\left(\frac{t}{n}, \frac{s}{m}\right).$$

Then we define the finite sample breakdown point (FSBP) of an estimate $\hat\mu_n$ at $W$ by

$$\varepsilon^* = \min\{\kappa(t, s) : M_{t,s} = \infty\}.$$

Then $\varepsilon^*$ is the minimum fraction of outliers in the complete sample or in the set of non-missing observations
that is required to take the estimate beyond any bound.

In order to get a lower bound for the FSBP of the location estimate $\hat\mu_n$ introduced in (8), we need
to define the uniform asymptotic breakdown point of $T_L$ as follows:

Definition 4 Given a functional $T_L$, its uniform asymptotic breakdown point (UABP) $\varepsilon_U^*(T_L)$ is defined
as the supremum of all $\varepsilon > 0$ satisfying the following property: for all $M > 0$ there exists $K > 0$ depending
on $M$ so that

$$P_F(|y| \le M) > 1 - \varepsilon \quad \text{implies} \quad |T_L(F)| < K. \tag{17}$$

For any location functional $T_L$ we have that $\varepsilon_U^*(T_L) \le 0.5$. This is an immediate consequence of the
following two facts: (a) $\varepsilon_A^*(T_L, F) \le 0.5$ for every location functional $T_L$ and every $F$, where $\varepsilon_A^*(T_L, F)$
is the asymptotic breakdown point of $T_L$ at the distribution $F$, and (b) $\varepsilon_U^*(T_L) \le \varepsilon_A^*(T_L, F)$ for all
$F$. In the case that $T_L$ is the median it is immediate to show that $\varepsilon_U^* = 0.5$. In fact, for any $\varepsilon < 0.5$,
choosing $K = M$ we get that (17) holds. This proves that $\varepsilon_U^* \ge 0.5$ and therefore $\varepsilon_U^* = 0.5$.

The following Theorem gives a lower bound for the FSBP of the estimate $\hat\mu_n$ defined in (8).

Theorem 5 Let $W$ be given by (16) and let $Z = \{(x_i, y_i) : i \in A\}$. Suppose that $\hat\beta_n = \tilde\beta_m(Z)$, where $\tilde\beta_m$
is a regression estimate for samples of size $m$. Let $\varepsilon_1 > 0$ be the FSBP at $Z$ of $\tilde\beta_m$ and call $\varepsilon_2 > 0$ the
UABP of $T_L$. Then the FSBP $\varepsilon^*$ of the estimate $\hat\mu_n$ at $W$ satisfies

$$\varepsilon^* \ge \varepsilon_3 = \min\big(\varepsilon_1, 1 - \sqrt{1 - \varepsilon_2}\big).$$
In the next Section we introduce MM estimates of regression and location. The maximum value of
$\varepsilon_1$ for an MM estimate of regression is $(n - c(G^*))/(2n)$, where $c(G)$ is defined by (23) (see Martin et
al. [E]). In Theorem 8 we show that the maximum value of $\varepsilon_2$ for an MM estimate of location is 0.5. Then,
if $c(G^*)/n$ is small, we can have $\varepsilon_3$ close to $1 - \sqrt{0.5} = 0.293$. A similar statement holds when we
replace $T_L$ by the median.
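
The lower bound of Theorem 5 is easy to evaluate numerically. The sketch below assumes the bound reads $\varepsilon_3 = \min(\varepsilon_1, 1 - \sqrt{1 - \varepsilon_2})$, which is consistent with the value $1 - \sqrt{0.5} = 0.293$ quoted above; the function name `fsbp_lower_bound` is hypothetical.

```python
import math

def fsbp_lower_bound(eps1, eps2):
    """Lower bound eps3 = min(eps1, 1 - sqrt(1 - eps2)) for the FSBP.

    eps1: FSBP of the regression estimate on the observed pairs.
    eps2: uniform asymptotic breakdown point of the location functional.
    """
    if not (0.0 < eps1 <= 1.0 and 0.0 < eps2 <= 1.0):
        raise ValueError("eps1 and eps2 must lie in (0, 1]")
    return min(eps1, 1.0 - math.sqrt(1.0 - eps2))

# Best case discussed in the text: eps1 close to 0.5 and eps2 = 0.5
print(round(fsbp_lower_bound(0.5, 0.5), 3))
```

When the regression estimate has a low breakdown point, the first argument dominates: for instance `fsbp_lower_bound(0.2, 0.5)` returns `0.2`.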



5 MM Regression and Location Functionals 

Several robust estimates for the parameters of the regression model (3) based on complete data $(x_1, y_1), \dots, (x_n, y_n)$
have been proposed. In this paper we will consider MM estimates. These estimates were introduced by
Yohai [20] for the linear model, while Fasano, Maronna, Sued and Yohai [6] extended them to
the case of nonlinear regression. For linear regression, MM estimates may combine the highest possible
breakdown point with an arbitrarily high efficiency in the case of Gaussian errors. It will be convenient
to present MM estimates of $\beta_0$ in their functional form, i.e., as a functional $T_{MM,\beta}(G)$ defined on a set
of distributions on $\mathbb{R}^{p+1}$, taking values in $\mathbb{R}^q$. Given a sample $(x_1, y_1), \dots, (x_n, y_n)$, the corresponding
estimate of $\beta_0$ is given by $\hat\beta_{MM} = T_{MM,\beta}(G_n)$, where $G_n$ is the empirical distribution of the sample.
As explained in Section 2.1, we have excluded the intercept from model (3). However, in order
to guarantee the consistency of the estimates without requiring symmetric errors, it is convenient to
estimate an additional parameter which can be naturally interpreted as an intercept or a center of the error
distribution. For this purpose put $\xi = (\beta, \alpha)$ with $\alpha \in \mathbb{R}$, and define $\bar g(x, \xi) = g(x, \beta) + \alpha$.

To define a regression MM functional $T_{MM}(G) = (T_{MM,\beta}(G), T_{MM,\alpha}(G))$, two loss functions, $\rho_0$
and $\rho_1$, are required. The function $\rho_0$ is used to define a dispersion functional $S(G)$ of the error
distribution. Then $T_{MM}$ is defined as a regression M functional with loss function $\rho_1$ and scale given by
$S(G)$.

Throughout this work, a bounded $\rho$-function is a function $\rho(t)$ that is a continuous nondecreasing
function of $|t|$, such that $\rho(0) = 0$, $\rho(\infty) = 1$, and $\rho(v) < 1$ implies that $\rho(u) < \rho(v)$ for $|u| < |v|$. We
also assume that $\rho_1(t) \le \rho_0(t)$ for all $t$.

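We start by defining the dispersion functional. For any distribution $G$ of $(x, y)$ and $\xi = (\beta, \alpha)$, let
$S^*(G, \xi)$ be defined by

$$E_G\, \rho_0\!\left(\frac{y - g(x, \beta) - \alpha}{S^*(G, \xi)}\right) = \delta, \tag{18}$$

where $\delta \in (0, 1)$. Then the dispersion functional $S(G)$ is defined by

$$S(G) = \min_{\xi \in B \times \mathbb{R}} S^*(G, \xi) \tag{19}$$

and the MM estimating functional $T_{MM}(G) = (T_{MM,\beta}(G), T_{MM,\alpha}(G))$ by

$$T_{MM}(G) = \arg\min_{\xi \in B \times \mathbb{R}} E_G\, \rho_1\!\left(\frac{y - g(x, \beta) - \alpha}{S(G)}\right). \tag{20}$$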

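We can also consider another regression functional $T_S(G) = (T_{S,\beta}(G), T_{S,\alpha}(G))$, called the regression
S functional, defined as follows:

$$T_S(G) = \arg\min_{\xi \in B \times \mathbb{R}} E_G\, \rho_0\!\left(\frac{y - g(x, \beta) - \alpha}{S(G)}\right). \tag{21}$$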

In the case of linear regression, the asymptotic breakdown point of both $T_{MM}$ and $T_S$ is given by

$$\varepsilon^* = \min(\delta, 1 - \delta - c(G)), \tag{22}$$

where

$$c(G) = \sup_{\gamma \neq 0} P_G\big(\gamma'(x', 1)' = 0\big). \tag{23}$$

The maximum breakdown point occurs when $\delta = (1 - c(G))/2$ and its value is $(1 - c(G))/2$. It can
be proved that this is the maximum possible breakdown point for equivariant regression functionals. In
the case of nonlinear regression both $T_{MM}$ and $T_S$ also have the same breakdown point, but it is not
given by a simple closed expression (see Fasano [5]).
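
The optimal choice of $\delta$ in the linear case can be verified in one line. The following short derivation assumes that (22) reads $\varepsilon^* = \min(\delta, 1 - \delta - c)$ with $c = c(G) \in [0, 1)$:

```latex
\[
\delta \mapsto \delta \ \text{is increasing}, \qquad
\delta \mapsto 1 - \delta - c \ \text{is decreasing},
\]
\[
\text{so } \min(\delta,\, 1 - \delta - c) \text{ is maximized where }
\delta = 1 - \delta - c
\iff \delta = \frac{1 - c}{2},
\qquad
\varepsilon^*_{\max} = \frac{1 - c}{2}.
\]
```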

Yohai [20] showed that MM estimates for linear regression may combine the highest possible
breakdown point $(1 - c(G))/2$ with a Gaussian efficiency as high as desired. In contrast, Hössjer [11] showed that
this is not possible for S estimates. The maximum asymptotic Gaussian efficiency of an S estimate with
$\varepsilon^* = (1 - c(G))/2$ is 0.33.

Let $(x, y)$ and $u$ satisfy model (3). Let $\{G_n^*\}$ be the sequence of empirical distributions associated with the
observed pairs $(x_i, y_i)$, i.e., those pairs such that $a_i = 1$:

$$G_n^* = \frac{1}{\sum_{i=1}^{n} a_i} \sum_{i=1}^{n} a_i \delta_{(x_i, y_i)}. \tag{24}$$

Then we can estimate $\beta_0$ by

$$\hat\beta_n = T_{MM,\beta}(G_n^*). \tag{25}$$

We can also choose as the location functional $T_L$, whose value $\mu_0 = T_L(F_0)$ we want to estimate, a
location MM functional. MM and S location functionals are defined similarly to the regression case. Let
$\rho_1^L$ and $\rho_0^L$ be bounded $\rho$-functions. We start by defining the dispersion functional. For any distribution
$F$ of $y$ and $\mu \in \mathbb{R}$ let $S_L^*(F, \mu)$ be defined by

$$E_F\, \rho_0^L\!\left(\frac{y - \mu}{S_L^*(F, \mu)}\right) = \delta,$$

where $\delta \in (0, 1)$. Then the dispersion functional $S_L(F)$ is defined by

$$S_L(F) = \min_{\mu} S_L^*(F, \mu)$$

and the MM location functional $T^L_{MM}(F)$ by

$$T^L_{MM}(F) = \arg\min_{\mu} E_F\, \rho_1^L\!\left(\frac{y - \mu}{S_L(F)}\right). \tag{26}$$

The S location functional $T^L_S(F)$ is defined similarly to the regression S functional. We write
$\mu_{00} = T^L_S(F_0)$ and $\mu_{01} = T^L_{MM}(F_0)$, whenever they are well defined. Location MM estimates may also
combine a high breakdown point with high Gaussian efficiency, and their breakdown point is given by
$\varepsilon^* = \min(\delta, 1 - \delta)$.

For the validity of assumptions A1-A4, the $\rho$-functions used to define the location and regression MM
functionals should satisfy assumptions R1 and R2 below.

R1 For some $m$, $\rho(u) = 1$ iff $|u| \ge m$, and $\log(1 - \rho)$ is concave on $(-m, m)$.

R2 $\rho$ is twice continuously differentiable.

A very popular family of bounded $\rho$-functions satisfying R1 and R2 is Tukey's bisquare family:

$$\rho^T_k(u) = 1 - \left(1 - \left(\frac{u}{k}\right)^2\right)^3 I(|u| \le k) \tag{27}$$

for $k > 0$.
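
To make the bisquare family and its role in the location MM functional (26) concrete, here is a minimal sketch. It rests on several assumptions made for illustration: the M-scale is computed by a standard fixed-point iteration, the minimization defining the S scale is restricted to the data points themselves (an approximation), and the MM step uses iteratively reweighted means started at the approximate S location. The function names and the simulated data are hypothetical; the constants 1.57 and 3.88 are the ones quoted in the Monte Carlo section.

```python
import numpy as np

def rho_bisquare(u, k):
    """Tukey's bisquare rho: 1 - (1 - (u/k)^2)^3 for |u| <= k, and 1 otherwise."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= k, 1.0 - (1.0 - (u / k) ** 2) ** 3, 1.0)

def m_scale(r, k0=1.57, delta=0.5, n_iter=200, tol=1e-9):
    """Solve mean(rho_k0(r / s)) = delta for s by a fixed-point iteration."""
    r = np.asarray(r, dtype=float)
    s = np.median(np.abs(r)) / 0.6745
    if s == 0.0:
        return 0.0
    for _ in range(n_iter):
        s_new = s * np.sqrt(np.mean(rho_bisquare(r / s, k0)) / delta)
        if abs(s_new - s) <= tol * s:
            return s_new
        s = s_new
    return s

def mm_location(y, k0=1.57, k1=3.88, delta=0.5, n_iter=200):
    """Approximate MM location: S scale over data-point candidates, then IRLS."""
    y = np.asarray(y, dtype=float)
    scales = np.array([m_scale(y - mu, k0, delta) for mu in y])
    S, mu = scales.min(), y[scales.argmin()]
    if S == 0.0:
        return mu, S
    for _ in range(n_iter):
        t = (y - mu) / S
        # bisquare weights, proportional to psi(t) / t
        w = np.where(np.abs(t) <= k1, (1.0 - (t / k1) ** 2) ** 2, 0.0)
        if w.sum() == 0.0:
            break
        mu_new = np.sum(w * y) / np.sum(w)
        converged = abs(mu_new - mu) <= 1e-10 * S
        mu = mu_new
        if converged:
            break
    return mu, S

# Hypothetical sample: 90% N(7.5, 1) observations and 10% outliers at 50
rng = np.random.default_rng(2)
y = 7.5 + rng.normal(size=200)
y[:20] = 50.0
mu_mm, scale_s = mm_location(y)
print(mu_mm, scale_s, np.mean(y))
```

Because the bisquare weights vanish beyond $k_1$ scale units, the gross outliers receive zero weight in the final step, whereas the sample mean is pulled far from 7.5.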

We denote by $\psi_0$, $\psi_1$, $\psi_0^L$ and $\psi_1^L$ the derivatives of $\rho_0$, $\rho_1$, $\rho_0^L$ and $\rho_1^L$. Put $\alpha_{01} = T_{MM,\alpha}(G_0^*)$,
$\alpha_{00} = T_{S,\alpha}(G_0^*)$ and $\sigma_0 = S(G_0^*)$.

Both regression and location MM and S functionals are studied in detail in Fasano et al. [6]. There we
can find sufficient conditions for weak continuity and Fisher consistency. Moreover, a weak differentiability
notion involving the influence function of the functionals is also developed. This notion allows one to obtain
asymptotic expansions like those required in (10) and (11). The following numbers will be used to derive
the influence functions of the regression functionals:

$$a_{0i} = E_{G_0^*} \psi_i'\big((y - g(x, \beta_0) - \alpha_{0i})/\sigma_0\big) = E_{K_0} \psi_i'\big((u - \alpha_{0i})/\sigma_0\big), \quad i = 0, 1,$$

$$e_{0i} = E_{K_0}\big[\psi_i'\big((u - \alpha_{0i})/\sigma_0\big)(u - \alpha_{0i})/\sigma_0\big], \quad i = 0, 1,$$

$$d_0 = E_{K_0}\big[\psi_0\big((u - \alpha_{00})/\sigma_0\big)(u - \alpha_{00})/\sigma_0\big] \quad \text{and} \quad b_0 = E_{G_0^*} \dot g(x, \beta_0).$$

Similarly we define $a_{0i}^L$, $e_{0i}^L$, $d_0^L$ and $\sigma_0^L$, replacing $\psi_i$ by $\psi_i^L$, $K_0$ by $F_0$, $g(x, \beta_0)$ by 0, $\alpha_{0i}$ by $\mu_{0i}$ and $\sigma_0$
by $\sigma_0^L = S_L(F_0)$. We denote by $A$ the covariance matrix of $\dot g(x, \beta_0)$ under $Q_0^*$.

Theorems 6 and 7 summarize the results for MM functionals of regression and location, respectively.

Theorem 6 Let $\rho_0$ and $\rho_1$ be bounded $\rho$-functions satisfying R1, with $\rho_1 \le \rho_0$. Assume that $K_0$ has a
strongly unimodal density and that (4) holds replacing $Q_0$ by $Q_0^*$. We will consider that either (a) $B$ is
compact or (b) $g(x, \beta) = \beta' x$ and $\delta < 1 - c(G_0^*)$. Then

(i) $\lim_{n \to \infty} T_{MM,\beta}(G_n^*) = \beta_0$ a.s. and therefore A1 is satisfied.

(ii) Assume also that $a_{00}$, $a_{01}$ and $d_0$ are different from 0, that A0 holds and that $\rho_0$ and $\rho_1$ satisfy
R2. Then (10) holds with $I_R(x, y) = I_{T_{MM,\beta}, G_0^*}(x, y)/E(a_1)$, where $I_{T_{MM,\beta}, G_0^*}(x, y)$ is the influence
function of $T_{MM,\beta}$ at $G_0^*$. Moreover, we have that

$$I_{T_{MM,\beta}, G_0^*}(x, y) = \frac{\sigma_0}{a_{01}} \psi_1\!\left(\frac{y - g(x, \beta_0) - \alpha_{01}}{\sigma_0}\right) A^{-1}\big(\dot g(x, \beta_0) - b_0\big), \tag{28}$$

and therefore A2 holds.



Theorem 7 Let $\rho_0^L$ and $\rho_1^L$ be bounded $\rho$-functions satisfying R1, with $\rho_1^L \le \rho_0^L$. Assume that $F_0$ has
a strongly unimodal density. Then

(i) There is only one value $\mu_{01} = T^L_{MM}(F_0)$ that attains the minimum in (26). $T^L_{MM}$ is weakly continuous
at $F_0$, and so A3 holds. In the case that $F_0$ is symmetric around $\nu_0$, we have $\mu_{01} = \nu_0$.

(ii) Assume also A0, that $\rho_0^L$ and $\rho_1^L$ satisfy R2 and that $a_{00}^L$, $a_{01}^L$ and $d_0^L$ are different from 0. Then
(11) holds when $I_L(y)$ is the influence function of $T^L_{MM}$ at $F_0$. Moreover we have

$$I_L(y) = \frac{\sigma_0^L}{a_{01}^L} \psi_1^L\!\left(\frac{y - \mu_{01}}{\sigma_0^L}\right) - \frac{e_{01}^L \sigma_0^L}{a_{01}^L d_0^L} \left(\rho_0^L\!\left(\frac{y - \mu_{00}}{\sigma_0^L}\right) - \delta\right), \tag{29}$$

and therefore A4 holds.
(iii) In case that $F_0$ is symmetric with respect to $\nu_0$ we have $e_{01}^L = 0$ and
To end this Section, we state the announced result regarding the uniform bound required for the
location functional in order to deduce a lower bound for the FSBP of $\hat\mu_n$, introduced in Section 4.

Theorem 8 Let $T^L_{MM}$ be an MM location functional. Then its uniform asymptotic breakdown point is
$\varepsilon_U^* = \min(1 - \delta, \delta)$.



6 Monte Carlo study 

In order to assess how the proposed robust method compares with the classical procedure that uses
least squares for $\hat\beta_n$ and the mean functional for $T_L$, we performed a Monte Carlo study. We consider the
following model

$$y_i = 3x_{i1} + \dots + 3x_{i5} + u_i, \quad 1 \le i \le 100,$$

where $x_{i1}, \dots, x_{i5}$ are i.i.d. random variables with uniform distribution on the interval $[0, 1]$, the $u_i$ are
standard normal variables ($u_i \sim N(0, 1)$) and $\beta_1 = \beta_2 = \dots = \beta_5 = 3$. The missingness indicators $a_i$ were
generated using a logistic model. Let $x_i = (x_{i1}, \dots, x_{i5})$, then

$$\frac{P(a_i = 1 \mid x_i)}{1 - P(a_i = 1 \mid x_i)}$$

Using this model and the distribution of the covariates, we have $P(a_i = 1) = 0.80$.

We study (a) the case with no outlier contamination and (b) the case where 10% of the observations
$(x_i, y_i)$ with $a_i = 1$ are replaced by $(x^*, y^*)$, with $x^* = (x^*, \dots, x^*)$. We take two values for $x^*$: 1 and 3,
and for $y^*$ we take a grid of values over the interval $[8, 50]$, with steps of 0.20. For each case we performed
1000 replications. We consider four functionals $T_L$: (i) the mean (MEAN in Figure 1); (ii) the median
(MEDIAN in Figure 1); (iii) an MM location functional with $\rho_i^L = \rho^T_{k_i}$, $k_0 = 1.57$, $k_1 = 3.88$ and $\delta = 0.5$,
whose corresponding location estimate has a Gaussian asymptotic efficiency of 90% (MM90 in Figure 1);
(iv) an MM location functional defined as in (iii) with constants $k_0 = 1.57$, $k_1 = 4.68$
and $\delta = 0.5$, whose location estimate has a Gaussian asymptotic efficiency of 95% (MM95 in Figure 1).
Note that when there is no outlier contamination, the distribution $F_0$ is symmetric with
center of symmetry 7.5, and then $T_L(F_0) = E(y) = 7.5$ in the four cases. When $T_L$ is the mean, $\hat\beta_n$ is
the least squares (LS) estimate. In the other three cases $\hat\beta_n$ is an MM estimate with $\rho_i = \rho^T_{k_i}$, $k_0 = 1.57$,
$k_1 = 3.44$ and $\delta = 0.5$. This estimate has an asymptotic efficiency of 85% in the case of Gaussian errors
and a breakdown point close to 0.5. In Table 1 we show the mean square errors (MSE) and the relative
efficiencies of the four estimates when there is no outlier contamination. In Figure 1 we plot the MSE of
the four estimates under outlier contamination.



Table 1. MSE and efficiencies without outliers

Estimates     MEAN    MEDIAN   MM90    MM95
MSE           0.047   0.056    0.051   0.049
Efficiency    100%    83%      91%     95%



As expected, when there are no outliers the classical estimate based on the mean is the most efficient,
but the estimates based on the MM functionals are highly efficient too. The estimate based on the median
is less efficient, but its efficiency is larger than that of the sample median, which is 64%. Note that the
estimate based on the median is a U-statistic similar to the Hodges-Lehmann estimate, which is also
more efficient than the median.

[Figure 1: MSE of the estimates under outlier contamination. Panel titles: "Mean, x* = 1" and "Robust Estimates, x* = 1".]

When there are outliers, we observe that the MSE of the estimate based on the mean increases beyond
any limit, while for the robust estimates the MSE remains bounded. In the case of $x^* = 1$ the MSE of
MM95 is larger than those of MEDIAN and MM90. For $x^* = 3$ the MSE of MEDIAN is larger than
those of the other two robust estimates. The MSEs of MM90 and MM95 are practically the same. Based
on these results we recommend using MM90, which behaves very well both with and without outliers.
7 Appendix

The following result plays a crucial role in the proof of Theorem 1.

Lemma 9 Let {z^} be a sequence of i.i.d. random vectors taking values in R fe and let h : K fe xR'->M 
be a continuous function. Assume that fi n is a strongly consistent sequence of estimators of 0o € R 9 . 
Denote by H n the empirical distribution at h(zi,j3 n ), 1 < i < n and by Hq the distribution of h(zi, /3q). 
Then H n converges weakly to Hq a.s., i.e. 



P(H n ^ w H ) = 1. 

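Proof. Recall that weak convergence is characterized by the following property: 

\[ H_n \to H \text{ weakly} \iff \int f\,dH_n \to \int f\,dH, \quad \forall f \in C_B(\mathbb{R}), \tag{30} \]

where $C_B(\mathbb{R})$ denotes the set of continuous bounded functions. Denote by $\widetilde{H}_n$ the empirical distribution 
at $h(z_i, \beta_0)$, for $1 \le i \le n$. By the Glivenko-Cantelli Theorem, $\widetilde{H}_n$ converges uniformly to $H_0$ a.s., and 
so it also converges weakly a.s. Then, it remains to find a set of probability one where 

\[ \lim_{n\to\infty} \left| \int f\,dH_n - \int f\,d\widetilde{H}_n \right| = 0, \quad \forall f \in C_B(\mathbb{R}). \]

Observe that 

\[ \int f\,dH_n = \frac{1}{n}\sum_{i=1}^n f(h(z_i, \hat\beta_n)), \qquad \int f\,d\widetilde{H}_n = \frac{1}{n}\sum_{i=1}^n f(h(z_i, \beta_0)), \]

and so 

\[ \left| \int f\,dH_n - \int f\,d\widetilde{H}_n \right| \le \frac{1}{n}\sum_{i=1}^n \left| f(h(z_i, \hat\beta_n)) - f(h(z_i, \beta_0)) \right| I_{\{\|z_i\| \le K\}} + 2\|f\|_\infty \frac{1}{n}\sum_{i=1}^n I_{\{\|z_i\| > K\}}. \]

Put $C_K = \{(z, \beta) : \|z\| \le K, \|\beta - \beta_0\| \le 1\}$. We have that $f \circ h : C_K \to \mathbb{R}$ is uniformly continuous 
and so, given $\varepsilon > 0$, there exists $\delta > 0$ such that if $(z_i, \beta_i) \in C_K$ and $\|(z_1, \beta_1) - (z_2, \beta_2)\| \le \delta$, then 
$|f(h(z_1, \beta_1)) - f(h(z_2, \beta_2))| < \varepsilon$. With probability one there exists a random integer $n_0$ such that 
$\|\hat\beta_n - \beta_0\| < \delta$ for all $n \ge n_0$. Then we get 

\[ \left| \int f\,dH_n - \int f\,d\widetilde{H}_n \right| \le \varepsilon + 2\|f\|_\infty \frac{1}{n}\sum_{i=1}^n I_{\{\|z_i\| > K\}}, \]

for all $n \ge n_0$. Assume also that 

\[ \frac{1}{n}\sum_{i=1}^n I_{\{\|z_i\| > K\}} \to P(\|z_1\| > K), \quad \forall K. \]

Then, with probability one 

\[ \limsup_{n\to\infty} \left| \int f\,dH_n - \int f\,d\widetilde{H}_n \right| \le \varepsilon + 2\|f\|_\infty P(\|z_1\| > K), \quad \forall \varepsilon > 0, \ \forall K. \]

To get the desired result, let $\varepsilon \to 0$ and $K \to \infty$. 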
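
As an informal numerical companion to Lemma 9 (not part of the proof), the following Python sketch measures the sup-distance between the empirical distribution of $h(z_i, \hat\beta_n)$ and that of $h(z_i, \beta_0)$. Everything specific here is an assumption made for illustration: the linear model $h(z, \beta) = y - \beta x$, the uniform covariates, the Gaussian errors, the deterministic stand-in $\hat\beta_n = \beta_0 + n^{-1/2}$ for a consistent estimator, and the names `ks_distance` and `plug_in_gap`.

```python
import bisect
import random

def ks_distance(sample_a, sample_b):
    """Sup-distance between the empirical CDFs of two samples."""
    sa, sb = sorted(sample_a), sorted(sample_b)
    na, nb = len(sa), len(sb)
    worst = 0.0
    # Both CDFs are right-continuous step functions, so the supremum
    # of their difference is attained at one of the sample points.
    for point in sa + sb:
        fa = bisect.bisect_right(sa, point) / na
        fb = bisect.bisect_right(sb, point) / nb
        worst = max(worst, abs(fa - fb))
    return worst

def plug_in_gap(n, beta0=2.0, seed=0):
    """Distance between residual distributions at beta_hat and at beta0."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    ys = [beta0 * x + rng.gauss(0.0, 1.0) for x in xs]
    beta_hat = beta0 + n ** -0.5  # hypothetical consistent estimator
    with_hat = [y - beta_hat * x for x, y in zip(xs, ys)]
    with_true = [y - beta0 * x for x, y in zip(xs, ys)]
    return ks_distance(with_hat, with_true)

if __name__ == "__main__":
    for n in (50, 500, 4000):
        print(n, plug_in_gap(n))
```

Under these assumptions the gap shrinks as `n` grows, which is the behaviour the lemma describes in the weak-convergence sense.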



The following results will be used throughout the proofs of the theorems stated in the previous 
sections. We start by proving that convolution preserves weak convergence. 

Lemma 10 Assume that $K_n \to_w K_0$ and $R_n \to_w R_0$. Then $K_n * R_n \to_w K_0 * R_0$. 

Proof. Let $(U, V)$ be independent random variables, both with uniform distribution on $[0, 1]$. Given a 
distribution function $F$, denote by $F^{-1}$ the generalized inverse function of $F$, whose value at $t$ is given 
by the infimum of the set $\{s : t \le F(s)\}$. Consider $U_n = K_n^{-1}(U)$ and $V_n = R_n^{-1}(V)$. It is known that 
(i) $U_n$ and $V_n$ are distributed according to $K_n$ and $R_n$, respectively, and (ii) $U_n$ and $V_n$ converge a.s. to 
$U_0 = K_0^{-1}(U)$ and $V_0 = R_0^{-1}(V)$, respectively (see Theorem 25.6 of Billingsley (1995) for details). Then 
$U_n + V_n$ converges a.s. to $U_0 + V_0$, and so the convergence also holds in distribution. The independence 
between $U$ and $V$ implies that $U_n + V_n \sim K_n * R_n$ while $U_0 + V_0 \sim K_0 * R_0$, proving the Lemma. ■ 
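
The generalized inverse used in this coupling argument is easy to compute when $F$ is an empirical distribution. The sketch below is a minimal illustration under that assumption; the function names `generalized_inverse` and `coupled_sum`, and the small tolerance guarding against floating-point rounding, are choices made for this example only.

```python
import math

def generalized_inverse(sample, t):
    """F^{-1}(t) = inf{s : t <= F(s)} for the empirical distribution F of `sample`."""
    if not 0.0 < t <= 1.0:
        raise ValueError("t must lie in (0, 1]")
    ordered = sorted(sample)
    n = len(ordered)
    # The infimum is the k-th order statistic, where k is the smallest
    # integer with k / n >= t (tolerance guards against rounding).
    k = max(1, math.ceil(n * t - 1e-12))
    return ordered[k - 1]

def coupled_sum(sample_k, sample_r, u, v):
    """K_n^{-1}(u) + R_n^{-1}(v) for empirical K_n and R_n."""
    return generalized_inverse(sample_k, u) + generalized_inverse(sample_r, v)

if __name__ == "__main__":
    print(generalized_inverse([3, 1, 4, 2], 0.5))
    print(coupled_sum([1, 2], [10, 20], 0.5, 0.75))
```

Feeding independent uniform draws `u` and `v` into `coupled_sum` produces a draw from the convolution of the two empirical distributions, which is the construction the proof relies on.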

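Lemma 11 Consider $\{(a_i, z_i)\}$ i.i.d. random vectors, with Bernoulli $a_i$ and $z_i \in \mathbb{R}^h$. Then 

\[ \sup_{z \in \mathbb{R}^h} \left| \frac{1}{n}\sum_{i=1}^n a_i I_{\{z_i \le z\}} - E\left[a_1 I_{\{z_1 \le z\}}\right] \right| \to 0, \quad \text{a.s.} \tag{31} \]

Proof. Note that 

\[ a_i I_{\{z_i \le z\}} = I_{\{z_i \le z\}} - I_{\{z_i \le z,\, a_i \le 0\}}. \tag{32} \]

By the Glivenko-Cantelli Theorem we have 

\[ \sup_{z \in \mathbb{R}^h} \left| \frac{1}{n}\sum_{i=1}^n I_{\{z_i \le z\}} - E\left[I_{\{z_1 \le z\}}\right] \right| \to 0, \quad \text{a.s.} \tag{33} \]

and 

\[ \sup_{z \in \mathbb{R}^h} \left| \frac{1}{n}\sum_{i=1}^n I_{\{z_i \le z,\, a_i \le 0\}} - E\left[I_{\{z_1 \le z,\, a_1 \le 0\}}\right] \right| \to 0, \quad \text{a.s.} \tag{34} \]

From (32), (33) and (34) we get 

\[ \sup_{z \in \mathbb{R}^h} \left| \frac{1}{n}\sum_{i=1}^n a_i I_{\{z_i \le z\}} - E\left[I_{\{z_1 \le z\}} - I_{\{z_1 \le z,\, a_1 \le 0\}}\right] \right| \to 0, \quad \text{a.s.}, \]

and by applying (32) to $i = 1$ the Lemma follows. ■ 

The proof of the following Lemma is similar to 
that of Lemma 4.2 presented by Yohai in [15]. It suffices to replace the law of large numbers for i.i.d. 
variables by the same law for U-statistics. 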

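Lemma 12 Assume that $\{z_i\}$ are i.i.d. random vectors taking values in $\mathbb{R}^k$, with common distribution 
$Q$. Let $f : \mathbb{R}^k \times \mathbb{R}^k \times \mathbb{R}^h \to \mathbb{R}$ be a continuous function. Assume that for some $\delta > 0$ we have that 

\[ E \sup_{\|\lambda - \lambda_0\| \le \delta} |f(z_1, z_2, \lambda)| < \infty \]

and that $\lambda_n \to \lambda_0$ a.s. Then 

\[ \frac{1}{n^2}\sum_{j=1}^n \sum_{i=1}^n f(z_i, z_j, \lambda_n) \to E f(z_1, z_2, \lambda_0) \quad \text{a.s.} \tag{35} \]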



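Proof of Theorem 1. According to Lemma 10, it only remains to prove the a.s. weak convergence 
of $R_n$ and $K_n$ to $R_0$ and $K_0$, respectively. The a.s. weak convergence of $R_n$ to $R_0$ follows from Lemma 
9, putting $z = x$ and $h(z, \beta) = g(x, \beta)$. Weak convergence of $(K_n)_{n \ge 1}$ to $K_0$ requires an extra argument. 
If $z = (x, y)$ and $h(z, \beta) = y - g(x, \beta)$, we get that 

\[ K_n(u) = \frac{1}{\sum_{i=1}^n a_i} \sum_{i=1}^n a_i I_{\{h(z_i, \hat\beta_n) \le u\}}. \]

By Lemma 11 we obtain 

\[ \sup_{u \in \mathbb{R}} \left| \frac{1}{n}\sum_{i=1}^n a_i I_{\{u_i \le u\}} - E\left[a_1 I_{\{u_1 \le u\}}\right] \right| \to 0 \quad \text{a.s.} \]

Since $a_1$ and $u_1$ are independent, we conclude that 

\[ \sup_{u \in \mathbb{R}} \left| \frac{\sum_{i=1}^n a_i I_{\{u_i \le u\}}}{\sum_{i=1}^n a_i} - K_0(u) \right| \to 0 \quad \text{a.s.}, \]

and then $\widetilde{K}_n(u) = \sum_{i=1}^n a_i I_{\{u_i \le u\}} / \sum_{i=1}^n a_i$ converges weakly to $K_0$ a.s. An argument similar to the one used 
in Lemma 9 shows that with probability one we have 

\[ \lim_{n \to \infty} \left| \int f\,dK_n - \int f\,d\widetilde{K}_n \right| = 0, \quad \forall f \in C_B(\mathbb{R}), \]

proving the a.s. weak convergence of $K_n$ to $K_0$. This concludes the proof of part (a) of Theorem 1. 
Part (b) is an immediate consequence of the weak continuity of $T$. □ 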
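
The distribution $K_n$ in this proof is the empirical distribution of the fitted residuals restricted to subjects with an observed response. The Python sketch below evaluates it directly, assuming the residuals $y_i - g(x_i, \hat\beta_n)$ have already been computed; the function name `K_n` and the toy numbers are illustrative only, and entries with $a_i = 0$ may hold any placeholder value because they receive zero weight.

```python
def K_n(u, residuals, observed):
    """Empirical CDF at u of the residuals with observed response (a_i = 1)."""
    total_observed = sum(observed)
    if total_observed == 0:
        raise ValueError("at least one response must be observed")
    # Each residual is weighted by its indicator a_i, so missing cases drop out.
    hits = sum(a for r, a in zip(residuals, observed) if r <= u)
    return hits / total_observed

if __name__ == "__main__":
    residuals = [0.5, 9.0, 0.5, -0.5]   # second entry is a placeholder
    observed = [1, 0, 1, 1]
    print(K_n(0.0, residuals, observed), K_n(0.5, residuals, observed))
```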

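Proof of Theorem 2. According to A4, we have that 

\[ \sqrt{n}(\hat\mu_n - \mu_0) = \sqrt{n}\{T_L(F_n) - T_L(F_0)\} = \sqrt{n}\,E_{F_n} I_L(y) + o_P(1). \]

Note that 

\[ \sqrt{n}\,E_{F_n} I_L(y) = \frac{1}{\eta_n n^{3/2}} \sum_{j=1}^n \sum_{i=1}^n a_i I_L\big(y_i - g(\hat\beta_n, x_i) + g(\hat\beta_n, x_j)\big), \]

where $\eta_n = \sum_{i=1}^n a_i / n$. Since $\eta_n \to E[a_1] = \eta$, to prove Theorem 2 it is enough to show that 

\[ V_n \to_d N(0, (\eta\tau)^2), \]

where 

\[ V_n = \frac{1}{n^{3/2}} \sum_{j=1}^n \sum_{i=1}^n a_i I_L\big(y_i - g(\hat\beta_n, x_i) + g(\hat\beta_n, x_j)\big). \]

Performing a Taylor expansion, we can write 

\[ V_n = d_n + c_n' n^{1/2}(\hat\beta_n - \beta_0), \]

where 

\[ d_n = \frac{1}{n^{3/2}} \sum_{i=1}^n \sum_{j=1}^n a_i I_L\big(u_i + g(\beta_0, x_j)\big) \]

and 

\[ c_n = \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \ell(a_i, x_i, y_i, a_j, x_j, y_j, \beta_n^*), \]

with $\beta_n^*$ between $\hat\beta_n$ and $\beta_0$, and 

\[ \ell(a_i, x_i, y_i, a_j, x_j, y_j, \beta) = a_i I_L'\big(y_i - g(\beta, x_i) + g(\beta, x_j)\big)\{\dot g(\beta, x_j) - \dot g(\beta, x_i)\}. \]

By Lemma 12, 

\[ c_n \to c = E\,\ell(a_1, x_1, y_1, a_2, x_2, y_2, \beta_0) \quad \text{a.s.} \tag{36} \]

From the U-statistics projection Theorem we get 

\[ d_n = \frac{1}{n^{1/2}} \sum_{i=1}^n \{e(x_i, u_i, a_i) + f(x_i)\} + o_P(1). \tag{37} \]

Finally, using (10), we get that 

\[ V_n = \frac{1}{n^{1/2}} \sum_{i=1}^n \{e(x_i, u_i, a_i) + f(x_i) + a_i c' I_R(x_i, y_i)\} + o_P(1), \]

and using the Central Limit Theorem we get (12). □ 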

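To prove Theorem 3 we need an asymptotic expansion for $n^{1/2}(\hat\mu_n - \mu_0)$. Let $z_i = (a_i, x_i, y_i)$ and 
consider 

\[ \Psi_1(z_i, z_j, \beta, \mu) = a_i\,\mathrm{sign}\big(g(x_j, \beta) + y_i - g(x_i, \beta) - \mu\big), \tag{38} \]

\[ \Lambda_1(\beta, \mu) = E\,\Psi_1(z_1, z_2, \beta, \mu), \quad \Lambda_{1\beta}(\beta, \mu) = \frac{\partial \Lambda_1(\beta, \mu)}{\partial \beta}, \quad \Lambda_{1\mu}(\beta, \mu) = \frac{\partial \Lambda_1(\beta, \mu)}{\partial \mu}, \]

and 

\[ J_n(\beta, \mu) = \frac{1}{n^{3/2}} \sum_{j=1}^n \sum_{i=1}^n \Psi_1(z_i, z_j, \beta, \mu). \]

The independence between $a_1$ and $(u_1, x_2)$ and the fact that $u_1 + g(x_2, \beta_0)$ has distribution $F_0$ allow us to 
conclude that $\Lambda_1(\beta_0, \mu_0) = E(a_1) E_{F_0}\,\mathrm{sign}(y - \mu_0) = 0$. Since $I_{T_{\mathrm{med}}, F_0}(y)$ is not differentiable we have to 
use an extra argument to obtain an asymptotic linear expansion for $n^{1/2}(\hat\mu_n - \mu_0)$. For this purpose, 
the following Lemma is crucial. It is related to a very general linear expansion satisfied by empirical 
processes based on U-statistics. 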
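
To make the estimating equation concrete, the following Python sketch computes the median-based location estimate as the median of all values $g(x_j, \beta) + y_i - g(x_i, \beta)$ with $a_i = 1$, together with the statistic $J_n(\beta, \mu) = n^{-3/2} \sum_j \sum_i a_i\,\mathrm{sign}(g(x_j, \beta) + y_i - g(x_i, \beta) - \mu)$. It is a minimal sketch under stated assumptions: the regression function `g`, the value of `beta` and the toy data are hypothetical, the function names are invented for this example, and the regression fit that would produce `beta` in practice is not shown.

```python
from statistics import median

def sign(v):
    """Sign function taking values -1, 0 or 1."""
    return (v > 0) - (v < 0)

def imputed_responses(xs, ys, observed, g, beta):
    """All values g(x_j, beta) + (y_i - g(x_i, beta)) with a_i = 1."""
    residuals = [y - g(x, beta) for x, y, a in zip(xs, ys, observed) if a == 1]
    return [g(x, beta) + u for x in xs for u in residuals]

def median_location(xs, ys, observed, g, beta):
    """Median of the distribution supported on the imputed responses."""
    return median(imputed_responses(xs, ys, observed, g, beta))

def J_n(xs, ys, observed, g, beta, mu):
    """n^{-3/2} times the double sum of a_i * sign(imputed response - mu)."""
    n = len(xs)
    values = imputed_responses(xs, ys, observed, g, beta)
    return sum(sign(v - mu) for v in values) / n ** 1.5

if __name__ == "__main__":
    xs = [0, 1, 2, 3]
    ys = [0.5, 0.0, 4.5, 5.5]      # ys[1] is a placeholder (missing response)
    observed = [1, 0, 1, 1]
    line = lambda x, b: b * x      # hypothetical regression function
    mu_hat = median_location(xs, ys, observed, line, 2.0)
    print(mu_hat, J_n(xs, ys, observed, line, 2.0, mu_hat))
```

In this toy example $J_n$ vanishes at the computed median and is nonincreasing in $\mu$, which is the monotonicity property exploited later when the estimate is bracketed.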
Lemma 13 Suppose the same assumptions as in Theorem 3. Then if $n^{1/2}(\hat\beta_n - \beta_0)$ and $n^{1/2}(\hat\mu_n - \mu_0)$ 
are bounded in probability we have that 

\[ J_n(\hat\beta_n, \hat\mu_n) = J_n(\beta_0, \mu_0) + \sqrt{n}\,\Lambda_{1\beta}(\beta_0, \mu_0)'(\hat\beta_n - \beta_0) + \sqrt{n}\,\Lambda_{1\mu}(\beta_0, \mu_0)(\hat\mu_n - \mu_0) + o_P(1). \]

The proof of Lemma 13 is based on a small number of intermediate results, Proposition 14 being the 
most important of them. It may be considered the U-statistics version of Lemma 3 of Huber [10]. Since 
we believe that these results can be useful in many other situations, we present them 
in a general setting. Consider a sequence $z_i$, $i \ge 1$, of i.i.d. random vectors of dimension $m$ and let 
$\Psi(z_1, z_2, \theta) : \mathbb{R}^m \times \mathbb{R}^m \times \Theta \to \mathbb{R}^p$, where $\Theta \subset \mathbb{R}^p$. Note that here we are reusing the notation 
already adopted for the particular case considered above. 

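Let $\Lambda(\theta) = E\,\Psi(z_1, z_2, \theta)$ and assume that $\Lambda(\theta_0) = 0$ for some $\theta_0 \in \mathbb{R}^p$. 

Consider 

\[ Z_n(\theta) = \frac{\left\| \sum_{i=1}^n \sum_{j=1}^n \left[ \Psi(z_i, z_j, \theta) - \Lambda(\theta) - \Psi(z_i, z_j, \theta_0) \right] \right\|}{n^{3/2} + n^2 \|\Lambda(\theta)\|} \tag{39} \]

and 

\[ U(z_1, z_2, \theta, d) = \sup_{\|\tau - \theta\| \le d} \|\Psi(z_1, z_2, \tau) - \Psi(z_1, z_2, \theta)\|. \]

We need the following assumptions: 

C1. For a fixed $\theta$, $\Psi(z_1, z_2, \theta)$ is measurable and separable. For the definition of 
separability, see Huber [10]. 

C2. There exist numbers $b > 0$, $c > 0$ and $d_0 > 0$ such that (i) $\Lambda(\theta)$ is continuously 
differentiable for $\|\theta - \theta_0\| \le d_0$ and $\dot\Lambda(\theta_0)$ is nonsingular, where $\dot\Lambda(\theta)$ is the differential matrix of $\Lambda(\theta)$, (ii) 
$E\,U(z_1, z_2, \theta, d) \le bd$ if $\|\theta - \theta_0\| + d \le d_0$ and (iii) $E\,U^2(z_1, z_2, \theta, d) \le bd$ if $\|\theta - \theta_0\| + d \le d_0$. 

C3. $E\|\Psi(z_1, z_2, \theta_0)\|^2 < \infty$. 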



Proposition 14 Suppose that assumptions C1-C3 hold. Then we have 

\[ \sup_{\|\theta - \theta_0\| \le d_0} Z_n(\theta) \to_p 0. \]

The proof is similar to that of Lemma 3 in Huber [10]. The only difference is that all the sums 
of independent variables need to be replaced by U-statistics. Moreover, the U-statistics counterparts 
of UmVn and the right hand side of equation (51) in Huber [10] must be approximated by sums of 
independent random variables using the Projection Theorem. □ 



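Let now $\Psi_1(z_1, z_2, \theta) : \mathbb{R}^m \times \mathbb{R}^m \times \Theta \to \mathbb{R}$, where $\Theta \subset \mathbb{R}^p$, and let $\Lambda_1(\theta) = E\,\Psi_1(z_1, z_2, \theta)$. 
Take $\theta_0 = (\theta_{01}, \ldots, \theta_{0p})$ such that $\Lambda_1(\theta_0) = 0$. Put 

\[ U_1(z_1, z_2, \theta, d) = \sup_{\|\theta^* - \theta\| \le d} |\Psi_1(z_1, z_2, \theta^*) - \Psi_1(z_1, z_2, \theta)| \]

and 

\[ Z_{n1}(\theta) = \frac{\left| \sum_{i=1}^n \sum_{j=1}^n \left[ \Psi_1(z_i, z_j, \theta) - \Lambda_1(\theta) - \Psi_1(z_i, z_j, \theta_0) \right] \right|}{n^{3/2} + n^2 |\Lambda_1(\theta)|}. \]

Denote by $\dot\Lambda_1(\theta) = (\Lambda_{11}(\theta), \ldots, \Lambda_{1p}(\theta)) = \partial \Lambda_1(\theta) / \partial \theta$. 

In order to prove a statement analogous to Proposition 14 for the univariate statistic $Z_{n1}(\theta)$, the 
following assumptions will be needed. 

D1. For a fixed $\theta$, $\Psi_1(z_1, z_2, \theta)$ is measurable and separable. 

D2. There exist numbers $b > 0$, $c > 0$ and $d_0 > 0$ such that (i) $\Lambda_1(\theta)$ is continuously differentiable for 
$\|\theta - \theta_0\| \le d_0$ and $\dot\Lambda_1(\theta_0) \ne 0$, (ii) $E\,U_1(z_1, z_2, \theta, d) \le bd$ if $\|\theta - \theta_0\| + d \le d_0$ and (iii) $E\,U_1^2(z_1, z_2, \theta, d) \le bd$ 
if $\|\theta - \theta_0\| + d \le d_0$. 

D3. $E\,\Psi_1^2(z_1, z_2, \theta_0) < \infty$. 

Proposition 15 Suppose that assumptions D1-D3 are satisfied. Then 

\[ \sup_{\|\theta - \theta_0\| \le d_0} Z_{n1}(\theta) \to_p 0. \tag{40} \]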



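Proof. Let $\dot\Lambda_1(\theta) = (\Lambda_{11}(\theta), \ldots, \Lambda_{1p}(\theta))'$. Without loss of generality, by D2, we can assume that $\Lambda_{11}(\theta_0) \ne 0$. 
For $2 \le i \le p$, define $\Psi_i(z_1, z_2, \theta) = \theta_i - \theta_{0i}$ and consider $\Psi(z_1, z_2, \theta) = (\Psi_1(z_1, z_2, \theta), \Psi_2(z_1, z_2, \theta), \ldots, \Psi_p(z_1, z_2, \theta))'$. 

Putting $\Lambda(\theta) = E\,\Psi(z_1, z_2, \theta)$, we have $\Lambda(\theta_0) = 0$ and 

\[ \dot\Lambda(\theta) = \begin{pmatrix} \Lambda_{11}(\theta) & \Lambda_{12}(\theta) \cdots \Lambda_{1p}(\theta) \\ 0 & I_{p-1} \end{pmatrix}. \]

Then $\det(\dot\Lambda(\theta_0)) = \Lambda_{11}(\theta_0) \ne 0$ and it is easy to check that the remaining assumptions C1-C3 are also 
satisfied. Let $Z_n(\theta)$ be given by (39); then by Proposition 14 we get that $\sup_{\|\theta - \theta_0\| \le d_0} Z_n(\theta) \to_p 0$. 
This implies (40). ■ 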



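Proposition 16 Suppose the same assumptions as in Proposition 15 and let $\hat\theta_n$ be a sequence of 
estimates of $\theta_0$ such that $n^{1/2}\|\hat\theta_n - \theta_0\| = O_P(1)$. Then 

\[ \frac{1}{n^{3/2}} \sum_{j=1}^n \sum_{i=1}^n \Psi_1(z_i, z_j, \hat\theta_n) = \frac{1}{n^{3/2}} \sum_{j=1}^n \sum_{i=1}^n \Psi_1(z_i, z_j, \theta_0) + \dot\Lambda_1(\theta_0)'\big(n^{1/2}(\hat\theta_n - \theta_0)\big) + o_P(1). \tag{41} \]

Proof. By Proposition 15 we have 

\[ Z_{n1}(\hat\theta_n) = \frac{\left| \sum_{j=1}^n \sum_{i=1}^n \left[ \Psi_1(z_i, z_j, \hat\theta_n) - \Lambda_1(\hat\theta_n) - \Psi_1(z_i, z_j, \theta_0) \right] \right|}{n^{3/2} + n^2 |\Lambda_1(\hat\theta_n)|} \to_p 0. \tag{42} \]

Using the Mean Value Theorem we get 

\[ Z_{n1}(\hat\theta_n) = \frac{\left| \sum_{j=1}^n \sum_{i=1}^n \left[ \Psi_1(z_i, z_j, \hat\theta_n) - \dot\Lambda_1(\theta_n^*)'(\hat\theta_n - \theta_0) - \Psi_1(z_i, z_j, \theta_0) \right] \right|}{n^{3/2}\left(1 + \left| \dot\Lambda_1(\theta_n^*)' n^{1/2}(\hat\theta_n - \theta_0) \right|\right)}, \]

where $\theta_n^* \to_p \theta_0$. Since $\dot\Lambda_1(\theta_n^*)' n^{1/2}(\hat\theta_n - \theta_0)$ is bounded in probability, (42) implies 

\[ \frac{1}{n^{3/2}} \sum_{j=1}^n \sum_{i=1}^n \left[ \Psi_1(z_i, z_j, \hat\theta_n) - \dot\Lambda_1(\theta_n^*)'(\hat\theta_n - \theta_0) - \Psi_1(z_i, z_j, \theta_0) \right] \to_p 0, \]

and so 

\[ \frac{1}{n^{3/2}} \sum_{j=1}^n \sum_{i=1}^n \Psi_1(z_i, z_j, \hat\theta_n) = \frac{1}{n^{3/2}} \sum_{j=1}^n \sum_{i=1}^n \Psi_1(z_i, z_j, \theta_0) + \dot\Lambda_1(\theta_n^*)'\big(n^{1/2}(\hat\theta_n - \theta_0)\big) + o_P(1). \]

Finally, using the continuity of $\dot\Lambda_1$ at $\theta_0$, the order of convergence of $\hat\theta_n$, and the fact that $\theta_n^* \to_p \theta_0$, we 
get (41). ■ 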

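In the following Proposition we give closed formulas for $\Lambda_{1\beta}(\beta_0, \mu_0)$ and $\Lambda_{1\mu}(\beta_0, \mu_0)$, which are part 
of the expansion stated in Lemma 13. 

Proposition 17 We have 

\[ \Lambda_{1\beta}(\beta_0, \mu_0) = 2E\left[a_1 k_0(-g(x_2, \beta_0) + \mu_0)\big(\dot g(x_2, \beta_0) - \dot g(x_1, \beta_0)\big)\right] \]

and 

\[ \Lambda_{1\mu}(\beta_0, \mu_0) = -2\eta f_0(\mu_0). \]

Proof. Let $\Delta(x_i, \beta) = g(x_i, \beta) - g(x_i, \beta_0)$. Then 

\[ \begin{aligned} \Lambda_1(\beta, \mu) &= \eta E\big(\mathrm{sign}(u_1 + g(x_2, \beta_0) + \Delta(x_2, \beta) - \Delta(x_1, \beta) - \mu) \mid a_1 = 1\big) \\ &= \eta E\big(E(\mathrm{sign}(u_1 + g(x_2, \beta_0) + \Delta(x_2, \beta) - \Delta(x_1, \beta) - \mu) \mid a_1 = 1, x_1, x_2)\big) \\ &= \eta E\big(1 - 2K_0(-g(x_2, \beta_0) - \Delta(x_2, \beta) + \Delta(x_1, \beta) + \mu) \mid a_1 = 1\big). \end{aligned} \]

Differentiating the last equation we get 

\[ \Lambda_{1\beta}(\beta, \mu) = -2\eta E\left[k_0(-g(x_2, \beta_0) - \Delta(x_2, \beta) + \Delta(x_1, \beta) + \mu)\big(-\dot g(x_2, \beta) + \dot g(x_1, \beta)\big) \mid a_1 = 1\right], \]

\[ \Lambda_{1\mu}(\beta, \mu) = -2\eta E\left[k_0(-g(x_2, \beta_0) - \Delta(x_2, \beta) + \Delta(x_1, \beta) + \mu) \mid a_1 = 1\right], \]

so that 

\[ \begin{aligned} \Lambda_{1\beta}(\beta_0, \mu_0) &= -2\eta E\left[k_0(-g(x_2, \beta_0) + \mu_0)\big(-\dot g(x_2, \beta_0) + \dot g(x_1, \beta_0)\big) \mid a_1 = 1\right] \\ &= 2E\left[a_1 k_0(-g(x_2, \beta_0) + \mu_0)\big(\dot g(x_2, \beta_0) - \dot g(x_1, \beta_0)\big)\right] \end{aligned} \]

and 

\[ \Lambda_{1\mu}(\beta_0, \mu_0) = -2\eta E\left[k_0(-g(x_2, \beta_0) + \mu_0)\right], \]

which equals $-2\eta f_0(\mu_0)$ because $u_1 + g(x_2, \beta_0)$ has distribution $F_0$. 
These prove the Proposition. ■ 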

Proof of Lemma 13. By Proposition 16 we only need to verify that, under the assumptions of 
Theorem 3, considering $\theta = (\beta, \mu)$, 

\[ \Psi_1(z_1, z_2, \theta) = a_1\,\mathrm{sign}\big(y_1 - g(x_1, \beta) + g(x_2, \beta) - \mu\big) \]

and 

\[ U_1(z_1, z_2, \theta, d) = \sup_{\|\theta^* - \theta\| \le d} |\Psi_1(z_1, z_2, \theta^*) - \Psi_1(z_1, z_2, \theta)|, \]

assumptions D1-D3 are satisfied. 

Assumptions D1 and D3 follow immediately. Assumption D2(i) follows from Proposition 17 and the 
fact that $f_0(\mu_0) > 0$. 

We now prove D2 (ii) and (iii). Take $d_0 = \delta$ as in A0; then if we put 

\[ w(x) = \sup_{\|\beta - \beta_0\| \le \delta} \|\dot g(x, \beta)\|, \tag{43} \]

we have $E\,w(x_1) < \infty$. To prove D2 (ii) and (iii), we have to show that there exist $K_1$ and $K_2$ such that 
for all $\theta$ and $d$ with $\|\theta - \theta_0\| + d \le d_0$, we have 

\[ E\,U_1^i(z_1, z_2, \theta, d) \le K_i d, \quad i = 1, 2. \tag{44} \]

For that purpose, we can write 

\[ \begin{aligned} U_1(z_1, z_2, \theta, d) \le \sup_{\|\beta^* - \beta\| \le d,\ |\mu^* - \mu| \le d} \big| &\,\mathrm{sign}\big(u_1 + g(x_1, \beta_0) - g(x_1, \beta^*) + g(x_2, \beta^*) - \mu^*\big) \\ &- \mathrm{sign}\big(u_1 + g(x_1, \beta_0) - g(x_1, \beta) + g(x_2, \beta) - \mu\big) \big|. \end{aligned} \tag{45} \]

Then if $\|\theta - \theta_0\| + d \le d_0$ and $\|\theta^* - \theta\| \le d$, we get that $\|\theta^* - \theta_0\| \le d_0$, and therefore $\|\beta^* - \beta_0\| \le d_0$ 
too. Note also that 

\[ \begin{aligned} &\big| \big(g(x_1, \beta_0) - g(x_1, \beta^*) + g(x_2, \beta^*) - \mu^*\big) - \big(g(x_1, \beta_0) - g(x_1, \beta) + g(x_2, \beta) - \mu\big) \big| \\ &\qquad \le (w(x_1) + w(x_2))\|\beta^* - \beta\| + |\mu^* - \mu| \le (w(x_1) + w(x_2) + 1)d. \end{aligned} \tag{46} \]

Let $z = w(x_1) + w(x_2) + 1$, $v = |u_1|$ and $w = v/z$. The left hand side of (45) is different from 0 
only when the two arguments of the sign function have different signs. By (46) this occurs only if 
$|u_1| \le (w(x_1) + w(x_2) + 1)d$. Then we can write 

\[ U_1(z_1, z_2, \beta, \mu, d) \le 2 I(w \le d). \]

Observe that $E\,z < \infty$ and that the density of $v$ is given by $f_v(v) = k_0(v) + k_0(-v)$, which is bounded 
by $2 \sup k_0$. Then, since the density of $w$ is 

\[ f_w(w) = \int_0^\infty z f_v(wz)\,dF_z \le 2 \sup k_0 \int_0^\infty z\,dF_z = 2 \sup k_0\,E(z), \]

we get 

\[ E\,U_1(z_1, z_2, \beta, \mu, d) \le 2P(w \le d) \le 4 \sup k_0\,E(z)\,d \]

and 

\[ E\,U_1^2(z_1, z_2, \beta, \mu, d) \le 4P(w \le d) \le 8 \sup k_0\,E(z)\,d, \]

and so (44) holds with $K_1 = 4 \sup k_0\,E(z)$ and $K_2 = 8 \sup k_0\,E(z)$. 

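The expansion obtained in Lemma 13 requires that $\sqrt{n}(\hat\theta_n - \theta_0) = O_P(1)$. The following Lemma 
shows that $\hat\theta_n = (\hat\beta_n, \hat\mu_n)$ satisfies this condition. 

Lemma 18 Under the assumptions of Theorem 3 we have that 

(a) $n^{1/2}(\hat\mu_n - \mu_0)$ is bounded in probability, 

(b) $J_n(\hat\beta_n, \hat\mu_n) \to_p 0$. 

Proof. Let 

\[ J_n^*(\beta, \mu) = \frac{1}{n^{3/2}} \sum_{j=1}^n \sum_{i=1}^n \Psi_1(z_i, z_j, \beta_0, \mu_0) + n^{1/2} \Lambda_{1\beta}(\beta_0, \mu_0)'(\beta - \beta_0) + n^{1/2} \Lambda_{1\mu}(\beta_0, \mu_0)(\mu - \mu_0). \]

Take $\varepsilon > 0$ and let $\mu_{1n}$ and $\mu_{2n}$ be defined by 

\[ J_n^*(\hat\beta_n, \mu_{1n}) = \varepsilon \quad \text{and} \quad J_n^*(\hat\beta_n, \mu_{2n}) = -\varepsilon. \]

By Proposition 17, $\Lambda_{1\mu}(\beta_0, \mu_0) \ne 0$, and it holds that 

\[ n^{1/2}(\mu_{1n} - \mu_0) = -\frac{1}{\Lambda_{1\mu}(\beta_0, \mu_0)} \left[ \frac{1}{n^{3/2}} \sum_{j=1}^n \sum_{i=1}^n \Psi_1(z_i, z_j, \beta_0, \mu_0) + \Lambda_{1\beta}(\beta_0, \mu_0)' n^{1/2}(\hat\beta_n - \beta_0) - \varepsilon \right]. \]

Both the first and second terms on the right hand side are bounded in probability, the former by the 
Central Limit Theorem for U-statistics and the latter by Assumption A2 and the Central Limit Theorem. 
Then $n^{1/2}(\mu_{1n} - \mu_0)$ is bounded in probability. Thereafter, by Lemma 13 we get 

\[ J_n(\hat\beta_n, \mu_{1n}) = \varepsilon + o_P(1). \]

Similarly we can prove that $n^{1/2}(\mu_{2n} - \mu_0)$ is bounded in probability and that 

\[ J_n(\hat\beta_n, \mu_{2n}) = -\varepsilon + o_P(1). \]

Then, since $J_n(\beta, \mu)$ is nonincreasing in $\mu$, by a property of the median we get that 

\[ \lim_{n \to \infty} P(\mu_{1n} \le \hat\mu_n \le \mu_{2n}) = 1, \]

and therefore $n^{1/2}(\hat\mu_n - \mu_0)$ is bounded in probability. We also have that 

\[ P\big(J_n(\hat\beta_n, \mu_{2n}) \le J_n(\hat\beta_n, \hat\mu_n) \le J_n(\hat\beta_n, \mu_{1n})\big) \to 1 \]

and therefore 

\[ P\big(-2\varepsilon \le J_n(\hat\beta_n, \hat\mu_n) \le 2\varepsilon\big) \to 1. \]

Since this holds for all $\varepsilon > 0$, part (b) of the Lemma is proved. ■ 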
Proof of Theorem 3. 

(a) To prove this part of the Theorem it suffices to show that $T_{\mathrm{med}}$ is weakly continuous at $F_0$. 
Take $\varepsilon > 0$ and $y_1$, $y_2$ continuity points of $F_0$ such that $\mu_0 - \varepsilon < y_1 < \mu_0 < y_2 < \mu_0 + \varepsilon$. Since 
$F_0$ is continuous and strictly increasing at $\mu_0$ we have that $F_0(\mu_0) = 0.5$ and there exists $\delta > 0$ such 
that $F_0(y_1) < 0.5 - \delta < 0.5 + \delta < F_0(y_2)$. Suppose that $F_n \to_w F_0$; then there exists $n_0$ such that 
for $n \ge n_0$ we have $F_n(y_1) < 0.5 - \delta$ and $F_n(y_2) > 0.5 + \delta$. This proves that $n \ge n_0$ implies that 
$\mu_0 - \varepsilon < y_1 < T_{\mathrm{med}}(F_n) < y_2 < \mu_0 + \varepsilon$. 

(b) Since $n^{1/2}(\hat\mu_n - \mu_0)$ and $n^{1/2}(\hat\beta_n - \beta_0)$ are bounded in probability, by Lemma 13 we get 

\[ J_n(\hat\beta_n, \hat\mu_n) = J_n(\beta_0, \mu_0) + \sqrt{n}\,\Lambda_{1\beta}(\beta_0, \mu_0)'(\hat\beta_n - \beta_0) + \sqrt{n}\,\Lambda_{1\mu}(\beta_0, \mu_0)(\hat\mu_n - \mu_0) + o_P(1), \]

and using Lemma 18 (b) we get 

\[ \begin{aligned} \sqrt{n}(\hat\mu_n - \mu_0) &= -\frac{1}{\Lambda_{1\mu}(\beta_0, \mu_0)} \left[ \Lambda_{1\beta}(\beta_0, \mu_0)' n^{1/2}(\hat\beta_n - \beta_0) + J_n(\beta_0, \mu_0) \right] + o_P(1) \\ &= d_n^* + n^{1/2} c^{*\prime}(\hat\beta_n - \beta_0) + o_P(1), \end{aligned} \]

where 

\[ d_n^* = -\frac{J_n(\beta_0, \mu_0)}{\Lambda_{1\mu}(\beta_0, \mu_0)} \quad \text{and} \quad c^* = -\frac{\Lambda_{1\beta}(\beta_0, \mu_0)}{\Lambda_{1\mu}(\beta_0, \mu_0)}. \]

Then it suffices to show that 

\[ d_n^* + c^{*\prime} n^{1/2}(\hat\beta_n - \beta_0) \to_d N(0, \tau^2). \]

The proof of this result is similar to that of $d_n + c' n^{1/2}(\hat\beta_n - \beta_0) \to_d N(0, \tau^2)$ in the proof of Theorem 2. □ 

Proof of Theorem [5J. Let $W$ be as in ([To). We have to show that given $t < n\varepsilon_3$ and $s < m\varepsilon_3$, 
there exists $K$ such that for any sample $W^* \in W_{ts}$, we have that $|T_L(F^*)| \le K$, where $F^*$ is the 
distribution constructed as in ([7]), based on $W^*$. According to the definition of $\varepsilon^*(T_L)$, it suffices to 
show that there exists $M$ such that for any $W^* \in W_{ts}$ the corresponding $F^*$ satisfies 

\[ P_{F^*}(|y| \le M) \ge 1 - \varepsilon_2. \tag{47} \]

Let 

\[ Z_s = \left\{ Z^* = \{(x_i^*, y_i^*) : i \in A\} : \#\{i \in A : (x_i^*, y_i^*) \ne (x_i, y_i)\} \le s \right\}. \]

Since $s/m < \varepsilon_1$ we can find $M_1$ such that 

\[ \sup_{Z^* \in Z_s} \|\hat\beta_m(Z^*)\| \le M_1, \tag{48} \]

and then we can find $M$ such that 

\[ \sup_{1 \le j \le n}\ \sup_{\|\beta\| \le M_1} |g(x_j, \beta)| \le M/2 \tag{49} \]

and 

\[ \sup_{i \in A}\ \sup_{\|\beta\| \le M_1} |y_i - g(x_i, \beta)| \le M/2. \tag{50} \]

Given $W^* \in W_{ts}$, let $\beta^* = \hat\beta_m(Z^*)$, with $Z^* \in Z_s$. Consider $B = \{j : 1 \le j \le n, x_j = x_j^*\}$ and 
$C = \{i \in A : (x_i, y_i) = (x_i^*, y_i^*)\}$. Then $\#B \ge (1 - \varepsilon_3)n$ and $\#C \ge (1 - \varepsilon_3)m$. For $1 \le j \le n$, $i \in A$, put 
$\hat y_{ij}^* = g(x_j^*, \beta^*) + (y_i^* - g(x_i^*, \beta^*))$. Then, when $j \in B$ and $i \in C$, by (48), (49) and (50), we have that 
$|\hat y_{ij}^*| \le M$ and so 

\[ \#\{(i, j) : |\hat y_{ij}^*| \le M\} \ge mn(1 - \varepsilon_3)^2 \ge (1 - \varepsilon_2)mn. \]

Since there are $mn$ pairs $(i, j)$, we get that $P_{F^*}(|y| \le M) \ge 1 - \varepsilon_2$ and then (47) 
holds. □ 

Proof of Theorem 6. The proof of this Theorem is essentially based on Theorem 7 of Fasano et 
al. [6]. As mentioned in Section 3, if $(x^*, y^*)$ has distribution $G_0^*$, then © is satisfied with $x^*$ having 
distribution $Q_0$ and $u^*$ with distribution $K_0$. Moreover, since by Lemma 11 $G_n^* \to_w G_0^*$, by parts (i), (ii) 
and (iii) of Theorem 7 of [6] with $G_0$ replaced by $G_0^*$, we get part (i) of the present Theorem. 

We now prove (ii). We start by proving that for any function $d$ such that $E_{G_0^*}|d(x, y)| < \infty$, we have that 

\[ E_{G_n^*} d(x, y) \to E_{G_0^*} d(x, y) \quad \text{a.s.} \tag{51} \]

Since 

\[ E_{G_n^*} d(x, y) = \frac{\sum_{i=1}^n a_i d(x_i, y_i)}{\sum_{i=1}^n a_i} = \frac{1}{n \eta_n} \sum_{i=1}^n a_i d(x_i, y_i) \]

and $\eta_n \to \eta$, by the Law of Large Numbers we have that $E_{G_n^*} d(x, y) \to E\,a_1 d(x_1, y_1)/\eta$ a.s. Since 
$E\,a_1 d(x_1, y_1)/\eta = E_{G_0^*} d(x, y)$, we obtain (51). 

Put now $T = (T_S, T_{MM}, S)$ and let $I_{T, G_0^*}(x, y)$ be its influence function at $G_0^*$. We now prove that 

\[ n^{1/2} E_{G_n^*} I_{T, G_0^*}(x, y) \to_d H, \]

where $H$ is a multivariate normal distribution. This follows by applying the Central Limit Theorem, from 

\[ n^{1/2} E_{G_n^*} I_{T, G_0^*}(x, y) = \frac{1}{\eta_n n^{1/2}} \sum_{i=1}^n a_i I_{T, G_0^*}(x_i, y_i), \]

the fact that $E_{G_0^*} I_{T, G_0^*}(x, y) = 0$, and the fact that under $G_0^*$ the influence function $I_{T, G_0^*}(x, y)$ has 
finite second moments. Then all the conditions required to apply parts (iv) and (v) of Theorem 7 of 
Fasano et al. [6] are satisfied. Then 

\[ n^{1/2}\big(T_{MM,\beta}(G_n^*) - \beta_0\big) = \frac{n^{1/2}}{\sum_{i=1}^n a_i} \sum_{i=1}^n a_i I_{T_{MM,\beta}, G_0^*}(x_i, y_i) + o_P(1). \]

Finally, using the expression for $I_{T_{MM,\beta}, G_0^*}$ derived in Fasano et al. [6], we obtain part (ii) of the 
Theorem. Part (iii) is an immediate consequence of the fact that in this case eoi = 0. □ 

Proof of Theorem 7. Part (i) follows from parts (i), (ii) and (iii) of Theorem 8 of Fasano et al. [6]. 
Let $T_L(F)$ be the complete functional 

\[ T_L(F) = \big(T_S^L(F), T_{MM}^L(F), S_L(F)\big). \]

Since $\{F_n\}$ is a sequence of random distributions with finite support converging a.s. to $F_0$, by part (iv) 
of Theorem 8 of Fasano et al. [6] we get that $T_L$ is weakly differentiable at $\{F_n\}$ a.s., and so 

\[ T_L(F_n) - T_L(F_0) = E_{F_n} I_{T_L, F_0}(y) + o\big(\|E_{F_n} I_{T_L, F_0}(y)\|\big), \tag{52} \]

where $I_{T_L, F_0}$ is the influence function of $T_L$ at $F_0$. 

We now prove that $n^{1/2} E_{F_n} I_{T_L, F_0}(y)$ is bounded in probability. Using a Taylor expansion, we get 

\[ \sqrt{n}\,E_{F_n} I_{T_L, F_0}(y) = \frac{1}{\eta_n} \left\{ D_n + C_n' \sqrt{n}(\hat\beta_n - \beta_0) \right\}, \tag{53} \]

where 

\[ D_n = n^{-3/2} \sum_{i=1}^n \sum_{j=1}^n a_i I_{T_L, F_0}\big(u_i + g(x_j, \beta_0)\big), \tag{54} \]

\[ C_n = \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n h(y_i, x_i, a_i, x_j, \beta_n^*), \tag{55} \]

$\beta_n^*$ between $\hat\beta_n$ and $\beta_0$ and 

\[ h(y_i, x_i, a_i, x_j, \beta) = a_i I_{T_L, F_0}'\big(y_i - g(x_i, \beta) + g(x_j, \beta)\big)\{\dot g(x_j, \beta) - \dot g(x_i, \beta)\}. \]

Assuming A0, by Lemma 12 we get 

\[ C_n \to C = E_{G_0}\, a_i I_{T_L, F_0}'\big(y_i - g(x_i, \beta_0) + g(x_j, \beta_0)\big)\{\dot g(x_j, \beta_0) - \dot g(x_i, \beta_0)\}, \quad \text{a.s.} \tag{56} \]

Using (53)-(56), the expansion (10) guaranteed by part (ii) of Theorem 6 and the fact that, by the 
U-statistics projection Theorem, $\{D_n\}$ converges to a normal distribution, we conclude that $\{\sqrt{n}\,\|E_{F_n} I_{T_L, F_0}(y)\|\}$ 
is bounded in probability. Therefore, from (52) we get 

\[ \sqrt{n}\{T_L(F_n) - T_L(F_0)\} = \sqrt{n}\,E_{F_n} I_{T_L, F_0}(y) + o_P(1). \]

This implies 

\[ \sqrt{n}\{T_{MM}^L(F_n) - T_{MM}^L(F_0)\} = \sqrt{n}\,E_{F_n} I_{T_{MM}^L, F_0}(y) + o_P(1), \]

and therefore (11) is satisfied with $I_L = I_{T_{MM}^L, F_0}$. Finally, (21) follows from formula (44) of Fasano et 
al. [6]. Part (ii) follows immediately from = 0. □ 



To prove Theorem [51 the following result is required. 



Lemma 19 Given $M$ and $\gamma > 0$, there exists $M^*$ such that $P_F(|y| \le M) \ge 1 - \delta + \gamma$ implies $S_L(F) \le M^*$. 

Proof. It is enough to show that there exists $M^*$ such that $S_L^*(F, 0) \le M^*$, where $S_L^*(F, \mu)$ is the location 
version of the object defined by ((TBj) for the regression case. 

Let $M^*$ be such that $\rho_0(M/M^*) \le \gamma/2$. Suppose that $S_L^*(F, 0) > M^*$. By definition of $S_L^*(F, 0)$, 

\[ \delta = E_F\,\rho_0\big(y / S_L^*(F, 0)\big). \tag{57} \]

On the other hand, let $A = \{|y| \le M\}$. By hypothesis, $P_F(A) \ge 1 - \delta + \gamma$, and so 

\[ E_F\,\rho_0\left(\frac{y}{S_L^*(F, 0)}\right) \le E_F\,\rho_0\left(\frac{y}{M^*}\right) \le (\gamma/2) P_F(A) + P_F(A^c) \le \gamma/2 + \delta - \gamma = \delta - \gamma/2, \]

contradicting (57). ■ 

Proof of Theorem [HI. We will prove that, given $M$ and $\gamma > 0$, there exists $K$ 
such that $|T_{MM}^L(F)| \le K$ for all $F$ with $P_F(|y| \le M) \ge \min(1 - \delta + \gamma, \delta + \gamma)$. In fact, note that 

\[ E_F\,\rho_1\left(\frac{y - T_{MM}^L(F)}{S_L(F)}\right) \le E_F\,\rho_1\left(\frac{y - T_S^L(F)}{S_L(F)}\right) \le E_F\,\rho_0\left(\frac{y - T_S^L(F)}{S_L(F)}\right) = \delta. \tag{58} \]

Let $M^*$ be as in Lemma 19 and let $a$ be such that $\rho_1(a/M^*)(\delta + \gamma) = \delta + \gamma/2$. Put $K = M + a$ and 
observe that $|y| \le M$ and $|T_{MM}^L(F)| > K$ imply that $|y - T_{MM}^L(F)| > a$. Suppose that $|T_{MM}^L(F)| > K$. 
Then 

\[ E_F\,\rho_1\left(\frac{y - T_{MM}^L(F)}{S_L(F)}\right) \ge E_F\,\rho_1\left(\frac{y - T_{MM}^L(F)}{M^*}\right) \ge P_F(A)\,\rho_1(a/M^*) \ge \rho_1(a/M^*)(\delta + \gamma) = \delta + \gamma/2 > \delta, \]

contradicting (58). □ 